Rec'd PCT/PTO 18 OCT 2004

iCT/DK 03 / 0026 6 





Kongeriget Danmark 



Patent application No.: PCT/DK 02/00882
Date of filing: 19 December 2002



Applicant: BioImage A/S, Mørkhøj Bygade 28, DK-2860 Søborg, Denmark
(Name and address)
DAHL, Søren Weis, Bolbrovej 46, DK-2960 Rungsted, Denmark



Title: FLUOROPHORE COMPLEMENTATION PRODUCTS 
IPC: - 

The attached documents are exact copies of the filed application 




PRIORITY DOCUMENT 

SUBMITTED OR TRANSMITTED IN 
COMPLIANCE WITH 
RULE 17.1(a) OR (b)



Patent- og Varemærkestyrelsen

Økonomi- og Erhvervsministeriet

5 August 2003 

Anne-Grethe Warrer-Madsen 
Head Clerk 






1016 PC 1 






FLUOROPHORE COMPLEMENTATION PRODUCTS 

Field of invention 

The present invention relates to various split fluorophore complementation products,
especially to ways of obtaining intensely fluorescent systems with Green Fluorescent
Protein (GFP).

Background of the invention

It has been suggested to use the reassembly of fragments of certain enzymes into the
complete enzyme as a measure of protein-protein interactions. Johnsson and Varshavsky
(Johnsson, N., Varshavsky, A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 10340-10344)
disclose reassembly of ubiquitin. This reassembly is detected through the irreversible
cleavage of the fusion by ubiquitin protease and release of a reporter. As opposed to the
two-hybrid technique, this technique includes the possibility of monitoring a protein-protein
interaction as a function of time, at the natural site of this interaction in a living cell.

Similar systems are suggested for the reassembly of other proteins including
β-galactosidase (Rossi, F., Charlton, C.A., Blau, H.M. (1997) Proc. Natl. Acad. Sci. U. S. A.
94, 8405-8410), dihydrofolate reductase (DHFR, WO98/34120), and β-lactamase
(Wehrman, T., Kleaveland, B., Her, J.H., Balint, R.F., Blau, H.M. (2002) Proc. Natl. Acad.
Sci. U. S. A. 99, 3469-3474). The basic concept is that by splitting a functional protein in
two fragments, the function is lost. The two fragments are transformed or transfected into
cells fused in frame to proteins X and Y, respectively. Binding between proteins X and Y
will bring the two fragments close together, increase the local concentration of the
complementing fragments, induce folding of these fragments and produce a functional
protein with an activity that is similar to that of the non-fragmented protein. If the function
is DHFR activity, the cells will survive only if proteins X and Y bind to each other.

Recently, a somewhat similar system has been described for the assisted reassembly and
folding of fragments of fluorescent proteins. As the function is fluorescence, the cells
will emit light upon excitation only if protein X and protein Y bind to each other, thereby
assisting complementation.

Ghosh (I. Ghosh, A.D. Hamilton, L. Regan (2000) J. Am. Chem. Soc. 122, 5658-5659,
WO01/87919) describes the use of a GFP variant called sg100 (F64L, S65C, Q80R,
Y151L, I167T and K238N). This GFP has single fluorescence excitation and emission
peaks at 475 nm and 505 nm, respectively (similar to sg25 described by Palm (Palm,
G.J., Zdanov, A., Gaitanaris, G.A., Stauber, R., Pavlakis, G.N., Wlodawer, A. (1997) Nat.
Struct. Biol. 4, 361-365)).

Functional GFP fragment complementation is accomplished by co-expressing two
independent peptides composed of the first 157 N-terminal amino acids of this GFP
(NtermGFP) and the remaining 81 C-terminal amino acids (starting from residue 158) of
this GFP (CtermGFP), with each of the GFP peptide fragments being fused to interacting
leucine zipper peptides that serve to associate the fragments.

Nagai (T. Nagai, A. Sawano, E.S. Park, A. Miyawaki (2001) Proc. Natl. Acad. Sci. U. S. A. 98,
3197-3202) tests a yellow fluorescent GFP variant that has the following mutations: S65G,
V68L, Q69K, S72A, T203Y. This variant was split between residues N144 and Y145
within the open 129-145 loop region, and the peptides were fused to M13 and calmodulin,
respectively, for use in a Ca2+ assay. However, when the constructs were transfected
individually into HeLa cells, the assay was not reliable.

Thus, there is a need for alternative GFPs for use in this technology.

Summary of the invention 

The present application discloses that certain GFPs can be reassembled and form a
functional fluorescent protein when expressed as two independent protein halves. For
example, when EGFP is expressed in mammalian cells, choosing a split site located in a
loop region between the residues that form the beta-sheet structures of the GFP
beta-barrel results in intense fluorescence (Example 5 and Example 7). The present
application further illustrates that EYFP is also reassembled and, surprisingly, that the
fluorescence from the reassembled protein is markedly enhanced if it contains the F64L
mutation (Example 9).

The reassembly of the protein does not occur if the two independent protein halves are
fused to non-interacting proteins. When brought together, however, they are reassembled
(Example 11).

Detailed disclosure

The non-fluorescent fragments of fluorescent proteins that can be combined to form one
functional fluorescent unit are usually produced by splitting the coding nucleotide
sequence of one fluorescent protein at an appropriate site and expressing each
nucleotide sequence fragment independently. The fluorescent protein fragments may be
expressed alone or in fusion with one or more protein fusion partners.

Thus, one aspect of the invention relates to two GFP fragments comprising an N-terminal
fragment of GFP, comprising a continuous stretch of amino acids from amino acid number
1 to amino acid number X of GFP, wherein the peptide bond between amino acid number
X and amino acid number X+1 is within a loop of GFP. The two GFP fragments also
comprise a C-terminal fragment of GFP, comprising a continuous stretch of amino acids
from amino acid number X+1 to amino acid number 238 of GFP.

Amino acid 1 is meant to indicate the first amino acid of GFP. Amino acid 238 is meant to 
indicate the last amino acid of the GFP. 
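
The definition of the two fragments can be illustrated with a minimal sketch. The function
name `split_gfp` and the placeholder sequence are assumptions made for illustration only;
they are not part of the application, and a real GFP sequence would be supplied by the user.

```python
def split_gfp(sequence: str, x: int) -> tuple[str, str]:
    """Split a protein sequence at the peptide bond between residues X and X+1.

    Residues are numbered from 1, as in the text. The N-terminal fragment holds
    residues 1..X and the C-terminal fragment holds residues X+1..end.
    """
    if not 1 <= x < len(sequence):
        raise ValueError("X must leave at least one residue in each fragment")
    n_fragment = sequence[:x]   # residues 1..X
    c_fragment = sequence[x:]   # residues X+1..end
    return n_fragment, c_fragment


# Placeholder of the right length only; NOT the real GFP sequence.
placeholder = "G" * 238
n_frag, c_frag = split_gfp(placeholder, 157)
print(len(n_frag), len(c_frag))
```

With X = 157 the fragment lengths are 157 and 81 residues, which matches the fragment
sizes reported for the split described by Ghosh in the background section.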

All residues are numbered according to the numbering of wild type A. victoria GFP
(GenBank accession no. M62653) and said numbering also applies to equivalent
positions in homologous sequences exemplified by alignment of fluorescent protein
sequences in Example 1. Thus, when working with truncated GFPs (compared to wild
type GFP) or when working with GFPs with additional amino acids, the numbering is
relative to the alignment.

Green Fluorescent Protein (GFP) is a 238 amino acid long protein derived from the
jellyfish Aequorea victoria (SEQ ID NO: 1). However, fluorescent proteins have also been
isolated from other members of the Coelenterata, such as the red fluorescent protein from
Discosoma sp. (Matz, M.V. et al. 1999, Nature Biotechnology 17: 969-973), GFP from
Renilla reniformis, GFP from Renilla muelleri or fluorescent proteins from other animals,
fungi or plants. The GFP exists in various modified forms including the blue fluorescent
variant of GFP (BFP) disclosed by Heim et al. (Heim, R. et al., 1994, Proc. Natl. Acad. Sci.
91:26, pp 12501-12504) which is a Y66H variant of wild type GFP; the yellow fluorescent
variant of GFP (YFP) with the S65G, S72A, and T203Y mutations (WO98/06737); the cyan
fluorescent variant of GFP (CFP) with the Y66W colour mutation and optionally the F64L,
S65T, N146I, M153T, V163A folding/solubility mutations (Heim, R., Tsien, R.Y. (1996) Curr.
Biol. 6, 178-182). The most widely used variant of GFP is EGFP with the F64L and S65T
mutations (WO 97/11094 and WO 96/23810) and insertion of one valine residue after the
first Met. The F64L mutation changes the amino acid in position 1 upstream from the
chromophore. GFP containing this folding mutation provides an increase in fluorescence
intensity when the GFP is expressed in cells at a temperature above about 30°C
(WO 97/11094).

It is known that fluorescence in wild-type GFP is due to the presence of a chromophore,
which is generated by cyclisation and oxidation of the SYG sequence at positions 65-67 in
the predicted primary amino acid sequence and, presumably by the same reasoning, of the
SHG sequence at positions 65-67 in other GFP analogues.

The present examples clearly illustrate how the fluorescence intensity from a reassembled
protein is enhanced in GFPs containing the F64L mutation as compared to GFPs without
this mutation. Thus, it is preferred that the GFP contains the F64L mutation, either by
selecting a GFP with this mutation (e.g. EGFP) or by introducing this mutation into the GFP
of choice (e.g. YFP as illustrated in Example 8).

In the nomenclature of GFP, an "E" is placed in front of the GFP (EGFP, EYFP, ECFP) to
indicate that this particular GFP is encoded by a nucleic acid with codon usage optimised for
mammalian cells. Most of these proteins also have an extra valine residue inserted after the
initial methionine residue, Met1. This extra valine residue is not considered in the numbering
of the residues. Thus, in a preferred embodiment, the GFP of the present invention is
selected from the group consisting of EGFP, EYFP, ECFP, dsRed and Renilla GFP.
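
The numbering convention for variants carrying the extra valine can be sketched as
follows. This is only an illustration under stated assumptions: the function name
`wild_type_number` and the label "1a" for the inserted valine are hypothetical choices
made here and are not defined in the application.

```python
def wild_type_number(index_in_variant: int) -> str:
    """Map a 1-based index in a valine-inserted variant to wild type numbering.

    Assumes a single extra valine directly after Met1 and no other insertions
    or deletions. The inserted valine gets the hypothetical label "1a" because
    it is not counted in the wild type numbering.
    """
    if index_in_variant < 1:
        raise ValueError("index must be 1 or greater")
    if index_in_variant == 1:
        return "1"    # the initial methionine, Met1
    if index_in_variant == 2:
        return "1a"   # inserted valine, not counted (label is an assumption)
    return str(index_in_variant - 1)


# The residue found at index 65 of such a variant is residue 64 in wild type numbering.
print(wild_type_number(65))
```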

In some of the examples of the present application, EGFP is used. Thus, in a preferred
embodiment of the invention, the GFP is EGFP. However, Example 8 and Example 11
show that EYFP has certain advantages. Thus, in another preferred embodiment of the
invention, the GFP is EYFP. It is also shown that EYFP mutated in position 1 preceding
the chromophore (E[F64L]YFP) has specific advantages. Thus, in a preferred
embodiment the GFP is E[F64L]YFP.

In the present context, the numbering of wild-type GFP (SEQ ID NO: 1) (Chalfie, M., Tu,
Y., Euskirchen, G., Ward, W.W., Prasher, D.C. (1994) Science 263, 802-805; this variant
of GFP has a histidine residue in position 231) is used. Based on the crystal structure of
GFP (Yang, F., Moss, L.G., Phillips, G.N. (1996) Nat. Biotech. 14, 1246-1251), Figure 5,
Table 1 and the data presented in the examples, it is evident that a split in almost any
loop will be re-assembled following appropriate spatial approximation of the
complementation fragments assisted by the interaction of the conjugated proteins. For the
purpose of this application the term "loop" shall be understood as a turn or element of
irregular secondary structure.

Thus, in one aspect, the invention relates to two GFP fragments as described above,
wherein X is 7, 8, 11 or 12, preferably X is 9 or 10 within the Thr9-Val11 loop; or
wherein X is 21, 22, 25 or 26, preferably X is 23 or 24 within the Asn23-His25 loop; or
wherein X is 36, 37, 40 or 41, preferably X is 38 or 39 within the Thr38-Gly40 loop; or
wherein X is 46, 47, 56 or 57, preferably X is between 48 and 55 i.e. X is 48, 49, 50, 51,
52, 53, 54 or 55 within the Cys48-Pro56 loop; or
wherein X is 70, 71, 76 or 77, preferably X is between 72 and 75 i.e. X is 72, 73, 74 or 75
within the Ser72-Asp76 loop; or
wherein X is 79, 80, 83 or 84, preferably X is 81 or 82 within the His81-Phe83 loop; or
wherein X is 86, 87, 90 or 91, preferably X is 88 or 89 within the Met88-Glu90 loop; or
wherein X is 99, 100, 103 or 104, preferably X is 101 or 102 within the Lys101-Asp103
loop; or
wherein X is 112, 113, 118 or 119, preferably X is between 114 and 117 i.e. X is 114, 115,
116 or 117 within the Phe114-Thr118 loop; or
wherein X is 126, 127, 145 or 146, preferably X is between 128 and 144 i.e. X is 128, 129,
130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143 or 144 within the
Ile128-Tyr145 loop; or
wherein X is 152, 153, 160 or 161, preferably X is between 154 and 159 i.e. X is 154, 155,
156, 157, 158 or 159 within the Ala154-Gly160 loop; or
wherein X is 169, 170, 175 or 176, preferably X is between 171 and 174 i.e. X is 171, 172,
173 or 174 within the Ile171-Ser175 loop; or
wherein X is 186, 187, 197 or 198, preferably X is between 188 and 196 i.e. X is 188, 189,
190, 191, 192, 193, 194, 195 or 196 within the Ile188-Asp197 loop; or
wherein X is 208, 209, 215 or 216, preferably X is between 210 and 214 i.e. X is 210, 211,
212, 213 or 214 within the Asp210-Arg215 loop.




Table 1. GFP secondary structures, GFP wild type sequence amino acid numbering. α and β
indicate α-helical and β-sheet secondary structures, respectively.

Name        Position           Symbol
Helix 1     Lys3 - Thr9        α1
Sheet 1     Val11 - Asn23      β1
Sheet 2     His25 - Thr38      β2
Sheet 3     Gly40 - Cys48      β3
Helix 2     Pro56 - Ser72      α2
Helix 3     Asp76 - His81      α3
Helix 4     Phe83 - Met88      α4
Sheet 4     Glu90 - Lys101     β4
Sheet 5     Asp103 - Phe114    β5
Sheet 6     Thr118 - Ile128    β6
Sheet 7     Tyr145 - Ala154    β7
Sheet 8     Gly160 - Ile171    β8
Sheet 9     Ser175 - Ile188    β9
Sheet 10    Asp197 - Asp210    β10
Sheet 11    Arg215 - Gly228    β11


Based on the findings disclosed in the examples, it is concluded that appropriate splitting
sites in GFP are located in the loop regions between the residues that form the beta-sheet
structures of the GFP beta-barrel. Accordingly, splits in GFP are preferably made in the
Asn23-His25 loop, the Thr38-Gly40 loop, the Lys101-Asp102 loop, the Phe114-Thr118
loop, the Ile128-Tyr145 loop, the Ala154-Gly160 loop, the Ile171-Ser175 loop, the
Ile188-Asp197 loop or the Asp210-Arg215 loop (Table 1, Figure 5).
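
The relationship between the secondary structure elements of Table 1 and the preferred
values of X can be sketched in code. The residue positions are taken from Table 1; the
names `ELEMENTS`, `loops` and `loop_containing_split` are illustrative assumptions. The
sketch treats a loop as the stretch from the last residue of one element to the first
residue of the next, so that X may run from the first loop residue up to the residue
just before the next element.

```python
# (name, first residue, last residue), positions as listed in Table 1.
ELEMENTS = [
    ("Helix 1", 3, 9), ("Sheet 1", 11, 23), ("Sheet 2", 25, 38),
    ("Sheet 3", 40, 48), ("Helix 2", 56, 72), ("Helix 3", 76, 81),
    ("Helix 4", 83, 88), ("Sheet 4", 90, 101), ("Sheet 5", 103, 114),
    ("Sheet 6", 118, 128), ("Sheet 7", 145, 154), ("Sheet 8", 160, 171),
    ("Sheet 9", 175, 188), ("Sheet 10", 197, 210), ("Sheet 11", 215, 228),
]


def loops(elements):
    """Return (start, end) residue pairs for the loops between consecutive elements."""
    return [
        (previous[2], following[1])
        for previous, following in zip(elements, elements[1:])
    ]


def loop_containing_split(x):
    """Return the loop in which the X / X+1 peptide bond lies, or None."""
    for start, end in loops(ELEMENTS):
        if start <= x < end:
            return (start, end)
    return None


print(loop_containing_split(157))  # the Ala154-Gly160 loop
print(loop_containing_split(100))  # inside Sheet 4, so no loop is returned
```

This reproduces the preferred ranges of X listed above, for example 128 to 144 for the
Ile128-Tyr145 loop, but it does not cover the additional flanking values of X that the
text also permits.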

The data in the present examples illustrate clearly that the Ala154-Gly160 loop is very
well suited for GFP reassembly. This is particularly the case when the GFP is divided
between amino acids Q157 and K158 (that is, when X is 157). Thus, a preferred
embodiment of the invention relates to two GFP fragments, wherein X is 157 within the
Ala154-Gly160 loop.

The data in the present examples also illustrate that the Ile171-Ser175 loop is very useful
for GFP reassembly. This is particularly the case when the GFP is divided between
amino acids E172 and D173 (that is, when X is 172). Thus, a preferred embodiment of the
invention relates to two GFP fragments, wherein X is 172 within the Ile171-Ser175 loop.

As illustrated in Example 9, fragments having overlapping sequences have certain
advantages. Thus one aspect of the invention relates to two GFP fragments comprising

(a) an N-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number 1 to amino acid number X of GFP, wherein the peptide bond between
amino acid number X and amino acid number X+1 is within a loop of GFP and

(b) a C-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number Y+1 to amino acid number 238 of GFP, wherein Y<X creating an
overlap of the two GFP fragments, and wherein the peptide bond between amino acid Y
and amino acid Y+1 is within a loop of GFP.

These overlapping GFP fragments are very attractive in e.g. functional cloning systems
where highly flexible linker sequences are required due to the very diverse nature of the
structures of fusion partners. The overlapping fragments permit either of the fusion
partners to have a long linker sequence.

For the purposes of deciding the nature of the Y in the C-terminal fragment of GFP
defined above, the same considerations as discussed for the value of X apply.

In one embodiment of the invention the overlap is just a few amino acid residues, e.g.
X-Y=1, X-Y=2, X-Y=3, X-Y=4, X-Y=5, X-Y=6, X-Y=7, X-Y=8, X-Y=9 or X-Y=10.

Due to the folding characteristics of GFP, a preferred embodiment of the
invention relates to overlapping N-terminal and C-terminal fragments of GFP wherein the
peptide bond between amino acid Y and amino acid Y+1 and the peptide bond between
amino acid X and amino acid X+1 are both within a loop of GFP. The thereby obtained
overlap is an entire α-helix or β-sheet secondary structure.
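
The overlapping fragments can be sketched in the same way. The function name
`overlapping_fragments`, the placeholder sequence and the particular pairing of X = 172
with Y = 157 are assumptions chosen for illustration; the pairing combines the two split
sites preferred above and makes the overlap span Sheet 8 of Table 1.

```python
def overlapping_fragments(sequence: str, x: int, y: int) -> tuple[str, str, str]:
    """Return the N-terminal fragment, the C-terminal fragment and their overlap.

    The N-terminal fragment holds residues 1..X, the C-terminal fragment holds
    residues Y+1..end, and Y < X so that residues Y+1..X occur in both.
    """
    if not 1 <= y < x < len(sequence):
        raise ValueError("require 1 <= Y < X < sequence length")
    n_fragment = sequence[:x]   # residues 1..X
    c_fragment = sequence[y:]   # residues Y+1..end
    overlap = sequence[y:x]     # residues Y+1..X, i.e. X-Y residues
    return n_fragment, c_fragment, overlap


# Placeholder of the right length only; NOT the real GFP sequence.
placeholder = "G" * 238
n_frag, c_frag, shared = overlapping_fragments(placeholder, x=172, y=157)
print(len(n_frag), len(c_frag), len(shared))
```

Here the overlap is X-Y = 15 residues (158 to 172), which contains the whole of Sheet 8
(Gly160 to Ile171).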

In order to obtain reassembly of the two halves of GFP, it is preferred to have the two
halves of GFP fused to interaction partners that will bring said two halves of GFP so close
together that the protein halves will fold and form functional GFP. Thus, a preferred
embodiment of the invention relates to a fusion protein comprising an N-terminal fragment
of GFP as described above conjugated to a first protein of interest. In a particular
embodiment the nucleic acid encoding the N-terminal fragment of GFP is fused in frame
to the nucleic acid encoding the first protein of interest. In similar embodiments, the
present invention relates to two GFP fragments as described above, wherein the C-terminal
fragment of GFP is conjugated to a second protein of interest. In a particular embodiment,
the nucleic acid encoding the C-terminal fragment of GFP is fused in frame to the nucleic
acid encoding the second protein of interest.

As will be evident to the skilled person, the protein of interest is conjugated to the GFP
fragment at the N-terminus or at the C-terminus. However, as illustrated in the examples,
conjugation of the first protein of interest to the N-terminal fragment of GFP shall
preferably be to the C-terminus of the N-terminal fragment of GFP. Likewise, conjugation
of the second protein of interest to the C-terminal fragment of GFP shall preferably be to
the N-terminus of the C-terminal fragment of GFP.

As will be evident from the present examples, the protein of interest is a protein, a peptide
or a non-proteinaceous partner.

In a typical embodiment of the invention, the conjugated protein as described above,
wherein the fragment of GFP is conjugated to a protein of interest, further comprises a
linker sequence between either fragment of GFP and the corresponding protein of
interest.

The linker must be chosen dependent on the protein of interest conjugated to the
fragment of GFP. Thus the linker must be flexible. A long linker prevents steric hindrance of
the complementation due to the protein of interest. However, short linkers keep the
fragments of GFP closer to each other and give better association.

The present invention also relates to the N-terminal fragment of GFP as described above. 
In a similar embodiment, the invention relates to the C-terminal fragment of GFP as 
described above. 

A preferred embodiment of the invention relates to a nucleic acid encoding any of the
fragments or fusion proteins described above. In one embodiment, the nucleic acid
construct encoding any of the proteins according to the invention described above is a
DNA construct. In another embodiment, the nucleic acid construct encoding any of the
proteins according to the invention described above is an RNA construct.

One aspect of the invention relates to a cell containing the two GFP fragments described
above. In similar embodiments, the invention relates to a cell containing the N-terminal
fragment of GFP described above. In similar embodiments, the invention relates to a cell
containing the C-terminal fragment of GFP described above.

Numerous cell systems for transfection exist. A few examples are mammalian cells isolated
directly from tissues or organs taken from healthy or diseased animals (primary cells), or
transformed mammalian cells capable of indefinite replication under cell culture conditions
(cell lines). The term "mammalian cell" is intended to indicate any living cell of mammalian
origin. The cell may be an established cell line, many of which are available from The
American Type Culture Collection (ATCC, Virginia, USA) or similar Cell Culture
Collections. The cell may be a primary cell with a limited life span derived from a
mammalian tissue, including tissues derived from a transgenic animal, or a newly
established immortal cell line derived from a mammalian tissue including transgenic
tissues, or a hybrid cell or cell line derived by fusing different cell types of mammalian
origin e.g. hybridoma cell lines. The cells may optionally express one or more non-native
gene products, e.g. receptors, enzymes, enzyme substrates, prior to or in addition to the
fluorescent probe. Preferred cell lines include, but are not limited to, those of fibroblast
origin, e.g. BHK, CHO, BALB, NIH-3T3, or of endothelial origin, e.g. HUVEC, BAE (bovine
artery endothelial), CPAE (cow pulmonary artery endothelial), HLMVEC (human lung
microvascular endothelial cells), or of airway epithelial origin, e.g. BEAS-2B, or of
pancreatic origin, e.g. RIN, INS-1, MIN6, bTC3, aTC6, bTC6, HIT, or of hematopoietic
origin, e.g. primary isolated human monocytes, macrophages, neutrophils, basophils,
eosinophils and lymphocyte populations, AML-14, AML-193, HL-60, RBL-1, U937, RAW,
JAWS, or of adipocyte origin, e.g. 3T3-L1, human pre-adipocytes, or of neuroendocrine
origin, e.g. AtT20, PC12, GH3, or of muscle origin, e.g. SKMC, A10, C2C12, or of renal
origin, e.g. HEK 293, LLC-PK1, or of neuronal origin, e.g. SK-N-DZ, SK-N-BE(2), HCN-1A,
NT2/D1.

The examples of the present invention are based on CHO cells. Therefore, fibroblast
derived cell lines such as BALB, NIH-3T3 and BHK cells are preferred.

It is preferred that the heterologous conjugates are introduced into the cell as plasmids,
e.g. individual plasmids mixed upon application to cells with a suitable transfection agent
such as FuGENE so that transfected cells express and integrate all heterologous
conjugates (or GFP fragments) simultaneously. Plasmids coding for each conjugate will
contain a different genetic resistance marker to allow selection of cells expressing those
conjugates. It is also preferred that each of the conjugates also contains a distinct amino
acid sequence, such as the HA or myc or Flag markers, that may be detected
immunocytochemically so that the expression of these conjugates in cells may be readily
confirmed. Many other means for introduction of one or both of the conjugates are equally
feasible, e.g. electroporation, calcium phosphate precipitate, microinjection, adenovirus
and retroviral methods, bicistronic plasmids encoding both conjugates etc.

Throughout the present invention, the term "protein" should have the general meaning.
That includes not only a translated protein, a peptide or a protein fragment, but also
chemically synthesized proteins. For proteins translated within the cell, the naturally, or
induced, post-translational modifications such as glycosylation and lipidation are expected
to occur and those products are still considered proteins.

The term "intracellular protein interaction" has the general meaning of an interaction
between two proteins, as described above, within the same cell. The interaction is due to
covalent and/or non-covalent forces between the protein components, most usually
between one or more regions or domains on each protein whose physico-chemical
properties allow for a more or less specific recognition and subsequent interaction
between the two protein components involved. In a preferred embodiment, the
intracellular interaction is a protein-protein binding.

The recording of the fluorescence will vary according to the purpose of the method in
question. In one embodiment the emitted light is measured with various apparatus known
to the person skilled in the art. Typically, such apparatus comprises the following
components: (a) a light source, (b) a method for selecting the wavelength(s) of light from
the source that will excite the luminescence of the luminophore, (c) a device that can
rapidly block or pass the excitation light into the rest of the system, (d) a series of optical
elements for conveying the excitation light to the specimen, collecting the emitted
fluorescence in a spatially resolved fashion, and forming an image from this fluorescence
emission (or another type of intensity map relevant to the method of detection and
measurement), (e) a bench or stand that holds the container of the cells being measured
in a predetermined geometry with respect to the series of optical elements, (f) a detector
to record the light intensity, preferably in the form of an image, (g) a computer or
electronic system and associated software to acquire and store the recorded information
and/or images, and to compute the degree of redistribution from the recorded images.

In one embodiment of the invention, the actual fluorescence measurements are made in a
standard type of fluorometer for plates of microtiter type (fluorescence plate reader).

In one embodiment, the optical scanning system is used to illuminate the bottom of a plate
of microtiter type so that a time-resolved recording of changes in luminescence or
fluorescence can be made from all spatial limitations simultaneously.

In one embodiment, the image is formed and recorded by an optical scanning system. 

A variety of instruments exist to measure light intensity. In one embodiment a
fluorescence plate reader is used (e.g. Wallac Victor (BD Biosciences), Spectrafluor
(Tecan), Flex station (Molecular Devices), Explorer (Acumen)). In another embodiment an
imaging plate reader is used (e.g. FLIPR (Molecular Devices), LEADseeker (Amersham),
VIPR (Molecular Devices)). In another embodiment an automated imager is used, like
Arrayscan (Cellomics), Incell Analyser (Amersham), Opera (Evotec). In a still further
embodiment a confocal fluorescence microscope is used (e.g. LSM510 (Zeiss)).

One aspect of the invention relates to a method for detecting the interaction between two
proteins of interest comprising the steps of:

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,

the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above; and

(b) measuring the fluorescence from the at least one cell,

fluorescent cells indicating interaction between the two proteins of interest.

In a similar embodiment, the invention relates to a method for monitoring the interaction
between two proteins of interest comprising the steps of:

(a) providing at least one cell containing at least one stretch of nucleic acid encoding two
heterologous conjugates:

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,

the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above;

(b) culturing the at least one cell under conditions allowing expression; and

(c) measuring the fluorescence from the at least one cell,

fluorescent cells indicating interaction between the two proteins of interest.

In one aspect of the methods, one of the proteins of interest is known, whereas the other
protein of interest is an unknown protein. By parallel transfection of the cells with both
heterologous conjugates, cells expressing an unknown protein that interacts with the
known protein of interest will be fluorescent and thereby easily detectable. In an alternative
embodiment of the invention, a cell line is established that stably expresses the
heterologous conjugate comprising the known protein of interest, and a library of
heterologous conjugates comprising the potential interaction partners is then transfected
into the cells, one per well.

As clearly illustrated in the present examples, the method is useful in detecting
compounds that induce interaction between two proteins of interest. Such a method
comprises the steps of:

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,

the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above;

(b) measuring the fluorescence from the at least one cell of step (a);

(c) applying a test compound to the at least one cell of step (b); and

(d) measuring the fluorescence from the at least one cell of step (c);

an increase in fluorescence observed from step (b) to step (d) indicating that the test
compound added in step (c) is capable of inducing interaction between the two proteins of
interest.
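
The comparison of the fluorescence measured in step (b) with that measured in step (d)
can be sketched as a simple fold-change test. The function name `induces_interaction`
and the default threshold `min_fold=2.0` are assumptions for illustration; the text only
requires an increase in fluorescence and does not specify a cut-off.

```python
def induces_interaction(before: float, after: float, min_fold: float = 2.0) -> bool:
    """Decide whether a test compound appears to induce interaction.

    `before` is the fluorescence measured in step (b), `after` the fluorescence
    measured in step (d). `min_fold` is a hypothetical cut-off that would in
    practice be set from control wells.
    """
    if before <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return after / before >= min_fold


print(induces_interaction(before=100.0, after=450.0))
print(induces_interaction(before=100.0, after=110.0))
```

A ratio is used rather than a difference so that wells with different expression levels
can be compared on the same scale; this is a design choice of the sketch, not a
requirement of the method.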

If a compound that induces interaction between two proteins of interest is known and
available, this compound can be useful as a reference compound for the method for
detecting compounds that induce interaction between two proteins of interest.

In a case where a compound that induces interaction between two proteins of interest is
known, it also opens the possibility to screen for compounds that interfere with a
conditional interaction between two protein components. Such a method comprises the
steps of:

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,

the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above;

(b) measuring the fluorescence from the at least one cell of step (a);

(c) applying a test compound and the compound that induces interaction between two
proteins of interest to the at least one cell of step (b); and

(d) measuring the fluorescence from the at least one cell of step (c);

an increase in fluorescence observed from step (b) to step (d) indicating that the test
compound added in step (c) does not prevent interaction between the two proteins of
interest; whereas an increase in fluorescence observed from step (b) to step (d), which
increase is smaller than the increase in fluorescence observed when the test
compound is absent and only the compound that induces interaction is present,
indicates that the test compound interferes with the induced interaction between the
two proteins of interest.
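
The comparison against the reference inducer can be expressed as a percent inhibition.
This is a minimal sketch under stated assumptions: the function name
`percent_inhibition` is hypothetical, and the same baseline from step (b) is assumed to
apply both to the well with the inducer alone and to the well with inducer plus test
compound.

```python
def percent_inhibition(baseline: float,
                       with_inducer: float,
                       with_inducer_and_test: float) -> float:
    """Express how much a test compound reduces the induced fluorescence increase.

    `baseline` is the fluorescence in step (b), `with_inducer` the fluorescence
    with only the inducing compound, and `with_inducer_and_test` the
    fluorescence in step (d) with both compounds present.
    """
    reference_increase = with_inducer - baseline
    if reference_increase <= 0:
        raise ValueError("the reference inducer must raise the fluorescence")
    test_increase = with_inducer_and_test - baseline
    return 100.0 * (1.0 - test_increase / reference_increase)


# An increase of 200 units against a reference increase of 400 units.
print(percent_inhibition(baseline=100.0, with_inducer=500.0,
                         with_inducer_and_test=300.0))
```

A value near 0 corresponds to a test compound that does not prevent the interaction,
and a value approaching 100 to one that interferes with it strongly.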

One particular advantage of the present method is that it can be carried out in a
heterogeneous cell population. This avoids inter alia the steps required to get clonal cells.
This is achieved by fluorescence activated cell sorting (FACS) prior to testing. One step in
that process is removal of the most green cells, that is, the cells wherein functional
fluorescence is achieved even though the two proteins of interest were not supposed to
interact. Another step is removal of the black cells, that is, the cells wherein the two
heterologous conjugates do not interact, e.g. where no or little functional complementation
occurs. This could be due to lack of transfection in those cells, a poor expression ratio
between the two constructs, or lack of functional expression of either construct. It is
presently anticipated that, in both the most green cells and the black cells, the transfection
has not taken place as desired, resulting in no, poor, or excessive complementation of the
heterologous conjugates. The "medium to low-green" cells thereby obtained are then used
in any of the methods described above, or in other complementation based methods. The
"most green", "medium green", "low green" and "black" cells respectively have decreasing
levels of fluorescence relative to one another. These levels are predetermined by the
skilled artisan in relative proportions.

The preferred method for detecting interactions between proteins of interest includes an
additional FACS step. The aim of this second FACS step is to isolate cells with a large
dynamic range. The first step is stimulating the "medium to low-green" FACS cells with
the compound that induces interaction between the two proteins of interest and thereafter
allowing sufficient time to pass to let the proteins interact and the fluorescent protein
fragments fold and become fluorescent. The next step is to subject them to the second FACS step
removing the most green cells. The remaining population of cells will have a low to
medium background and are still capable of forming the fluorescent protein upon
interaction between the two proteins of interest. When the cells have grown to sufficient
number, and a number of generations will have diluted the fluorescence, the cells are
ready to use in any of the methods outlined above, e.g. detecting compounds that induce
interaction between two proteins of interest and screening for compounds that interfere
with a conditional interaction between two protein components.
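
The first sorting step (discarding the "black" and the "most green" cells) can be illustrated by the sketch below. The fractions used for the two discarded groups are hypothetical; as stated above, the relative proportions are chosen by the skilled artisan, and an actual sorter applies its gates to measured event data rather than to a Python list.

```python
def gate_medium_to_low_green(intensities, black_fraction=0.2, most_green_fraction=0.1):
    """Keep the cells left after discarding the dimmest and the brightest cells.

    black_fraction and most_green_fraction are illustrative assumptions,
    not values taken from the described method.
    """
    if black_fraction < 0 or most_green_fraction < 0:
        raise ValueError("fractions must be non-negative")
    if black_fraction + most_green_fraction >= 1:
        raise ValueError("fractions must leave some cells in the gate")
    ranked = sorted(intensities)
    n = len(ranked)
    low_cut = int(n * black_fraction)               # number of "black" cells removed
    high_cut = n - int(n * most_green_fraction)     # index where "most green" cells start
    return ranked[low_cut:high_cut]


kept = gate_medium_to_low_green(list(range(10)), black_fraction=0.2, most_green_fraction=0.1)
print(kept)
```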

In a preferred aspect of the methods, the at least one cell is a mammalian cell. 

The term "compound" is intended to indicate any sample that has a biological function or
exerts a biological effect in a cellular system. The sample may be a sample of a biological
material such as a sample of a body fluid including blood, plasma, saliva, milk, urine, or a
microbial or plant extract, an environmental sample containing pollutants including heavy
metals or toxins, or it may be a sample containing a compound or mixture of compounds
prepared by organic synthesis or genetic techniques. The compound may be small organic
compounds or biopolymers, including proteins and peptides.

In another preferred aspect of the methods, the heterologous conjugates are fusion 
proteins. 

This technology has broad applicability. Because interactions are detected directly, it can
be used in genomics and proteomics. The high sensitivity makes it applicable to target
discovery and the high specificity makes it applicable to target validation. It can be scaled
to Drug Discovery in High Throughput Screening. The technology is quantitative, which
makes it applicable to nanotechnology and diagnostics.



The invention will be illustrated more specifically in the following non-limiting examples.

Examples

Example 1: Alignment of fluorescent proteins 



GenBank entry   Fluorescent protein

P42212          Aequorea victoria green-fluorescent protein
AF372525        Renilla reniformis green fluorescent protein
AY015996        Renilla muelleri green fluorescent protein
AY013824        Aequorea macrodactyla isolate GFPxm
AF384683        Montastraea cavernosa green fluorescent protein
AF401282        Montastraea faveolata green fluorescent protein
AY015995        Ptilosarcus sp. CSG-2001 green fluorescent protein
AF322221        Anemonia sulcata green fluorescent protein asFP499
AF322222        Anemonia sulcata nonfluorescent red protein asCP562
AF246709        Anemonia sulcata GFP-like chromoprotein FP595
AF168419        DsRed Discosoma sp. fluorescent protein FP583
AF168420        Discosoma striata fluorescent protein FP483
AF168421        Anemonia majano fluorescent protein FP486
AF168422        Zoanthus sp. fluorescent protein FP506
AF168423        Zoanthus sp. fluorescent protein FP538
AF168424        Clavularia sp. fluorescent protein FP484



The alignment is presented in Figure 16. 






Example 2: Construction of EGFP complementation fragment probes

Anti-parallel leucine zippers (called NZ and CZ) that can bind to each other within
prokaryotic and eukaryotic cells were fused to different fragments of GFP to evaluate the
optimal site for splitting GFP for use of such fragments in molecular complementation
experiments, including bimolecular fluorescence complementation experiments. NZ and
CZ leucine zippers were prepared by annealing and ligating phosphorylated
oligonucleotides 2110-2115 (for NZ zipper, see Table 2) or phosphorylated oligonucleotides
2116-2121 (for CZ zipper) into NcoI-BamHI cut pTrcHis-A vector (commercially available
from Invitrogen), producing vector PS1515 (expression vector encoding NZ zipper) or
PS1516 (expression vector encoding CZ zipper). The oligos ligated in NZ and CZ
annealing mixes 1 produced the coding sequences of the N-terminal parts of the NZ and
CZ zippers. The oligos ligated in NZ and CZ annealing mixes 2 produced the coding
sequences of the middle parts of the NZ and CZ zippers, and the oligos ligated in NZ and
CZ annealing mixes 3 produced the coding sequences of the C-terminal parts of the NZ
and CZ zippers.

Annealing primer pairs for NZ zipper

NZ annealing mix 1
Forward oligo 2110 (1 µM)                 5 µl
Reverse oligo 2111 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

NZ annealing mix 2
Forward oligo 2112 (1 µM)                 5 µl
Reverse oligo 2113 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

NZ annealing mix 3
Forward oligo 2114 (1 µM)                 5 µl
Reverse oligo 2115 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl



Each of the annealing mixes was heated at 80°C for 2 minutes on a pre-heated Hybaid
OmniGene PCR machine, which was subsequently turned off and allowed to cool to room
temperature (about 10 min). The fragments were subsequently put on ice.

Annealing primer pairs for CZ zipper

CZ annealing mix 1
Forward oligo 2116 (1 µM)                 5 µl
Reverse oligo 2117 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

CZ annealing mix 2
Forward oligo 2118 (1 µM)                 5 µl
Reverse oligo 2119 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

CZ annealing mix 3
Forward oligo 2120 (1 µM)                 5 µl
Reverse oligo 2121 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

Each of the annealing mixes was heated at 80°C for 2 minutes on a pre-heated Hybaid
OmniGene PCR machine, which was subsequently turned off and allowed to cool to room
temperature (about 10 min). The fragments were subsequently put on ice.
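
As a worked check of the annealing mixes, the final concentrations can be computed with the usual dilution relation (stock concentration times stock volume divided by total volume). The sketch below assumes that the concentrations listed in the tables are those of the stock solutions added; the variable names are illustrative only.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Volumes of one annealing mix, in microlitres, as listed in the tables.
volumes_ul = {"forward oligo": 5.0, "reverse oligo": 5.0, "buffer": 2.0, "water": 8.0}
total_ul = sum(volumes_ul.values())

oligo_um = final_concentration(1.0, volumes_ul["forward oligo"], total_ul)   # µM, each oligo
tris_mm = final_concentration(50.0, volumes_ul["buffer"], total_ul)          # mM Tris-HCl
mgcl2_mm = final_concentration(10.0, volumes_ul["buffer"], total_ul)         # mM MgCl2

print(total_ul, oligo_um, tris_mm, mgcl2_mm)
```

Under these assumptions each mix has a total volume of 20 µl and contains each oligo at 0.25 µM.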

Restriction digestion of pTrcHis-A prokaryotic expression vector

The pTrcHis-A prokaryotic expression vector, cut with NcoI and BamHI restriction
enzymes and gel purified, was used for cloning the prepared NZ and CZ leucine zipper
coding sequences:

Restriction digestion of pTrcHis-A vector

pTrcHis-A (1 µg/µl)                                         2 µl
NcoI (10 U/µl)                                              1 µl
NheI (5 U/µl), optional                                     0.5 µl
BamHI (20 U/µl)                                             1 µl
100x BSA                                                    0.4 µl
10x NEB (New England Biolabs, NEB) BamHI buffer             3 µl
H2O                                                         23 µl
Calf intestinal phosphatase (optional, last 20 min only)    0.5 µl

The vector was digested for about 1 hour at 37°C and purified by agarose gel
electrophoresis. The desired vector fragment was recovered from the gel using the
QIAquick Gel Extraction kit (spin columns) from Qiagen and recovered in 50 µl of elution
buffer. NheI, which cuts between NcoI and BamHI, was included to minimise the amounts
of uncut and self-ligating vector.

Ligation and transformation of annealed NZ oligo pairs

Each of the three NZ annealing mixtures 1-3 was diluted 50-fold (1 µl of mixture in 50 µl of
H2O) and mixed and ligated into the cut vector as follows (three hours at 20-24°C):

Ligation of NZ zipper fragments into pTrcHis-A vector

Annealing mix 1                                     1 µl
Annealing mix 2                                     1 µl
Annealing mix 3                                     1 µl
10x T4 DNA ligase buffer (New England Biolabs)      1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)       0.5 µl
pTrcHis-A (NcoI + BamHI cut)                        0.5 µl
H2O                                                 5 µl



Alternatively, the fragments in NZ annealing mixes 1, 2, and 3 can be ligated in absence
of vector and purified by agarose gel electrophoresis before being ligated into the NcoI-
BamHI cut vector. The annealed and ligated oligonucleotides from annealing mixes 1-3
had single stranded terminal overhangs that were compatible with the overhangs that
were generated by NcoI and BamHI restriction digestion of pTrcHis-A. After ligation of the
fragment into cut pTrcHis-A, the NcoI and BamHI sites were regenerated.

Following ligation into the vector, 2 µl of the ligation mixture was transformed into 50 µl of
One Shot TOP10 chemically competent E. coli cells (Invitrogen) following the
manufacturer's protocol. The ligation can be performed using different amounts or
volumes of fragments and buffers. The inserted DNA sequence (SEQ ID NO: 7) and the
encoded NZ zipper peptide (SEQ ID NO: 8) are as follows:

MAGGTGSGALKKELQANKKE
CCATGGCCGGTGGTACCGGTTC

LAQLKWELQALKKELAQ*D
CTGGCCCAGCTGAAGTGGGAGCTGCAGGCCCTGAAGAAGGAGCTGGCCCAGTAGGATCC



The Gly-Gly-Thr-Gly-Ser-Gly amino acid sequence in the terminus is part of the linker
sequence that was inserted between the NZ zipper peptide and the N-terminal fragments
of EGFP (NtermEGFP). The linker sequence in the NtermEGFP-NZ fusion protein is also
Gly-Gly-Thr-Gly-Ser-Gly, with the Gly-Gly-Thr-Gly coding sequence being repeated in the
NtermEGFP reverse amplification primers 2129, 2130, and 2131 (Table 3). Underlined
are the unique NcoI (CCATGG), AgeI (ACCGGT) and BamHI (GGATCC) sites used for
cloning of the zipper peptide into pTrcHis-A and the NtermEGFP-NZ fragments into the
NZ zipper vector PS1515 (see below). The asterisk (*) shows a stop codon.
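
The relationship between the coding sequence and the zipper peptide can be checked by translating the DNA with the standard genetic code. The sketch below does this for the 3' portion of the NZ insert shown above (the second DNA line) and confirms that the BamHI site follows the stop codon. The function and variable names are illustrative assumptions, and the sequence literal is a transcription of that line.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate in frame from the first base, stopping after the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        peptide.append(aa)
        if aa == "*":
            break
    return "".join(peptide)


# 3' portion of the NZ zipper insert, including the stop codon and BamHI site.
NZ_TAIL = ("CTGGCCCAGCTGAAGTGGGAGCTGCAGGCC"
           "CTGAAGAAGGAGCTGGCCCAGTAGGATCC")

print(translate(NZ_TAIL))
print("GGATCC" in NZ_TAIL)
```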

Ligation and transformation of annealed CZ oligo pairs

Each of the three CZ annealing mixtures 1-3 was diluted 50-fold (1 µl of mixture in 50 µl of
H2O) and mixed as follows:

Ligation of CZ zipper fragments into pTrcHis-A vector

CZ annealing mix 1                                  1 µl
CZ annealing mix 2                                  1 µl
CZ annealing mix 3                                  1 µl
10x T4 DNA ligase buffer (New England Biolabs)      1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)       0.5 µl
pTrcHis-A (NcoI + BamHI cut)                        0.5 µl
H2O                                                 5 µl



Alternatively, the fragments in CZ annealing mixes 1, 2, and 3 can be ligated in absence
of vector and purified by agarose gel electrophoresis before being ligated into the NcoI-
BamHI cut vector. The annealed and ligated oligonucleotides from annealing mixes 1-3
had single stranded terminal overhangs that were compatible with the overhangs that
were generated by NcoI and BamHI restriction digestion of pTrcHis-A. After ligation of the
fragment into cut pTrcHis-A, the NcoI and BamHI sites were regenerated.

Following ligation into the vector, 2 µl of the ligation mixture was transformed into 50 µl
of One Shot TOP10 chemically competent E. coli cells (Invitrogen) following the
manufacturer's protocol. The ligation can be performed using different amounts or
volumes of fragments and buffers. The inserted DNA sequence (SEQ ID NO: 9) and the
encoded CZ zipper peptide (SEQ ID NO: 10) are as follows:

MASEQLEKKLQALEKKLAQL
CCATGGCCAGCGAGCAGCTGGAGAAGAAGCTGCAGGCCCTGGAGAAGAAGCTGGCCCAGCTG

EWKNQALEKKLAQGGTG*
GAGTGGAAGAACCAGGCCCTGGAGAAGAAGCTGGCCCAGGGCGGCACCGGTTAGGATCC



The Gly-Gly-Thr-Gly amino acid sequence in the terminus is part of the linker sequence
that was inserted between the CZ zipper peptide and the C-terminal fragments of EGFP
(CtermEGFP). The linker sequence in the CZ-CtermEGFP fusion protein is also
Gly-Gly-Thr-Gly, with the Thr-Gly coding sequence being repeated in the CtermEGFP forward
amplification primers 2133, 2134, and 2135 (Table 3). Underlined are the unique NcoI
(CCATGG), AgeI (ACCGGT) and BamHI (GGATCC) sites used for cloning of the zipper
peptide into pTrcHis-A and the CZ-CtermEGFP fragments into the CZ zipper vector
PS1516 (see below). The asterisk (*) shows a stop codon.

Example 3: E. coli colony PCR screen, plasmid miniprep and DNA
sequencing

The transformed bacteria were plated on Luria Broth (LB) agar plates containing 100
µg/ml of carbenicillin as selection (present in all E. coli media used). To quickly identify
transformants containing the desired NZ or CZ constructs, colony PCR screening was
performed using oligos 2110 (5' forward NZ oligo) and 2115 (3' reverse NZ oligo) or using
oligos 2116 (5' forward CZ oligo) and 2121 (3' reverse CZ oligo):



Per sample (15 µl reaction volume)

10x Taq polymerase buffer (Perkin Elmer)      1.5 µl
dNTP (5 mM nucleotide, each)                  0.3 µl
50 mM MgCl2                                   0.6 µl
Dimethyl sulphoxide (DMSO)                    0.3 µl
Taq polymerase (Perkin Elmer)                 0.2 µl
5' forward primer (10 µM)                     0.5 µl
3' reverse primer (10 µM)                     0.5 µl
H2O                                           6.1 µl
Transformant resuspended in H2O               5.0 µl

Cycling parameters (RoboCycler Gradient 96, Stratagene)
Initial denaturation at 94°C for 3 min followed by 25 cycles of (all steps of 1 min):
denaturation at 94°C, primer annealing at 53°C and primer extension at 72°C.
Finally, an additional extension step at 72°C was included (5 min).

16 NZ transformants and 16 CZ transformants were screened. PCR fragments having the 
expected product sizes of about 120 base pairs were amplified from 14 NZ clones and 15 
CZ clones, as determined by agarose gel electrophoresis analysis. 
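
When many colonies are screened, the per-sample volumes above are usually combined into a master mix. The sketch below scales the listed volumes to 16 samples and checks that the components add up to the stated 15 µl reaction volume. The 10% pipetting overage and all names are assumptions made for illustration; the source does not state how the reactions were assembled.

```python
# Per-sample volumes in microlitres, taken from the colony PCR table (template excluded).
COLONY_PCR_UL = {
    "10x Taq polymerase buffer": 1.5,
    "dNTP (5 mM each)": 0.3,
    "50 mM MgCl2": 0.6,
    "DMSO": 0.3,
    "Taq polymerase": 0.2,
    "5' forward primer (10 uM)": 0.5,
    "3' reverse primer (10 uM)": 0.5,
    "H2O": 6.1,
}
TEMPLATE_UL = 5.0  # resuspended transformant, added to each tube separately


def master_mix(n_samples, overage=0.1):
    """Scale the per-sample volumes; overage is a hypothetical pipetting reserve."""
    factor = n_samples * (1 + overage)
    return {name: round(volume * factor, 2) for name, volume in COLONY_PCR_UL.items()}


per_sample_total = sum(COLONY_PCR_UL.values()) + TEMPLATE_UL
mix_for_16 = master_mix(16)

print(round(per_sample_total, 2))
for name, volume in mix_for_16.items():
    print(f"{name}: {volume} ul")
```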

Three of the positive colonies were picked from each transformation (NZ and CZ) and
used to inoculate 5 ml of liquid LB medium. After culturing at 37°C overnight, plasmid
DNA was purified by mini preparations using the QIAprep kit from Qiagen.
Plasmids containing correct NZ (PS1515) or CZ (PS1516) fragment inserts were identified
by DNA sequencing on an ABI PRISM model 377 DNA sequencer using forward
sequencing primer 1282.

Example 4: Prokaryotic expression vectors encoding fusion proteins of
EGFP fragment and zipper

The DNA sequences encoding the NZ and CZ zippers in the prokaryotic expression
vectors PS1515 and PS1516, respectively, can be fused to DNA sequences encoding
desired EGFP fragments (N-terminal fragments of EGFP are called NtermEGFP and
C-terminal fragments of EGFP are called CtermEGFP) or other fragments using the unique
AgeI restriction sites appropriately located in linker sequences in either the 5' end (as in
the NZ vector PS1515) or in the 3' end (as in the CZ vector PS1516) of the leucine zipper
coding sequence in combination with either of the unique NcoI or BamHI sites used for
cloning the zipper coding fragments (DNA and amino acid sequences are shown above).
The general structures of the fusion protein coding sequences are shown in Figure 1.

For example, to prepare a prokaryotic expression vector encoding a fusion protein
consisting of NZ zipper N-terminally fused to an NtermEGFP fragment (that is, fused to
the C-terminus of the NtermEGFP fragment), e.g. residues 1-172 (NtermEGFP172), this
region of the EGFP coding sequence in the commercial expression vector pEGFP-C1
(Clontech) was amplified by PCR using forward oligo 2128 (containing a unique NcoI site)
and reverse oligo 2131 (containing a unique AgeI site) in accordance with Table 3.

Per sample (50 µl reaction volume)

10x Pfu polymerase buffer (Stratagene)        5.0 µl
dNTP (5 mM nucleotide, each)                  1.0 µl
Pfu Hot Start polymerase (Stratagene)         1.0 µl
5' forward primer (10 µM)                     1.0 µl
3' reverse primer (10 µM)                     1.0 µl
pEGFP-C1 vector (10 ng/µl)                    2.0 µl
H2O                                           39.0 µl

Cycling parameters (Hybaid OmniGene PCR machine)
Initial denaturation at 94°C for 3 min followed by 25 cycles of (all steps of 1 min):
denaturation at 94°C, primer annealing at 53°C and primer extension at 72°C.
Finally, an additional extension step at 72°C was included (5 min).



The PCR fragment encoding the desired EGFP fragment, e.g. the above-mentioned
fragment composed of residues 1-172, with appropriately engineered terminal restriction
sites contained in the primer sequences, was then gel purified as described above, cut with
NcoI and AgeI or AgeI and BamHI, and ligated into the constructed NZ or CZ prokaryotic
leucine zipper expression vectors PS1515 or PS1516, which had been cut with the same
enzymes and gel purified:

Restriction digestion of NtermEGFP and CtermEGFP PCR fragments

EGFP fragment (gel purified)                  26 µl
NcoI (10 U/µl) or BamHI (20 U/µl)             0.5 µl
AgeI (10 U/µl)                                1.0 µl
10x New England Biolabs buffer 2              3 µl

Restriction digestion of NZ (PS1515) and CZ (PS1516) vectors

Vector (1 µg/µl)                              1 µl
NcoI (10 U/µl) or BamHI (20 U/µl)             0.33 µl
AgeI (10 U/µl)                                0.66 µl
10x New England Biolabs buffer 2              1 µl
H2O                                           7 µl

All enzymes were from New England Biolabs. The DNA preparations were digested for 1
hour at 37°C and gel purified.

Ligation of EGFP fragments into cut PS1515 or PS1516 vector

Cut and purified vector                               2 µl
Cut and purified NtermEGFP or CtermEGFP fragment      4 µl
10x T4 DNA ligase buffer (New England Biolabs)        1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)         0.5 µl
H2O                                                   2.5 µl

Ligation proceeded for 30 min at 22°C, after which 2 µl of each ligation mixture was
transformed into 50 µl of One Shot TOP10 chemically competent E. coli cells (Invitrogen).
The transformed cells were plated on LB plates containing carbenicillin and plasmids were
prepared from two colonies from each transformation as described above.



Example 5: EGFP based bimolecular fluorescence complementation in E.
coli

Plasmids that expressed functional NtermEGFP-NZ or CZ-CtermEGFP complementation
constructs were identified by co-transforming 10 µl of One Shot TOP10 chemically
competent E. coli cells (Invitrogen) with 1 µl of each of appropriately matched
NtermEGFP-NZ and CZ-CtermEGFP plasmids (i.e., plasmids that express EGFP
fragments truncated after (NtermEGFP fragments) or before
(CtermEGFP fragments) the same splitting site) and plating the co-transformed cells on LB
plates containing carbenicillin and 5 mM of isopropyl-β-thiogalactoside (IPTG).

The transformed cells were grown overnight at 37°C. E. coli colonies that were green
fluorescent because of EGFP based bimolecular fluorescence complementation were
visible on the agar plate without magnification about 10-20 hours after transformation (the
fluorescence developed further during storage of the plates at 5°C for one or more days)
when illuminated with a blue light source (Fiberoptic-Heim LQ2600) and viewed through
yellow filter glasses.

Functional complementation was clearly visible in cells co-transformed with
complementation constructs based on splits between either residues 157 and 158 or
residues 172 and 173. The DNA sequences of the expression vectors that
produced functional NtermEGFP-NZ or CZ-CtermEGFP complementation fragments
(named PS1594, PS1595, PS1596, PS1597, see Table 4) were verified by DNA
sequencing using primer 1282 as previously described.

Surprisingly, the E. coli colonies of cells co-transformed with the vectors expressing the
EGFP complementation fragments with a split in the Ile171-Ser175 loop (namely between
residues 172 and 173, vectors PS1595 and PS1597) were significantly more fluorescent
than the colonies of cells that were co-transformed with vectors expressing EGFP
complementation fragments that were split in the Ala154-Gly160 loop (namely between
residues 157 and 158, vectors PS1594 and PS1596).

Functional complementation was not clearly visible in cells co-transformed with
complementation constructs based on a split between residues 144 and 145. DNA
sequencing confirmed that expression vectors PS1614 and PS1615 encoded the correct
NtermEGFP-NZ and CZ-CtermEGFP complementation fragments, respectively.
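
The notion of "appropriately matched" plasmids can be illustrated as follows: an N-terminal and a C-terminal construct are matched when both derive from the same split site. The sketch uses the plasmid assignments given in the text of this document; the data structure, the function names and the illustrative protein length of 238 residues are assumptions introduced here.

```python
# plasmid -> (fragment type, last residue of the N-terminal fragment at the split)
CONSTRUCTS = {
    "PS1614": ("Nterm", 144), "PS1615": ("Cterm", 144),
    "PS1596": ("Nterm", 157), "PS1594": ("Cterm", 157),
    "PS1597": ("Nterm", 172), "PS1595": ("Cterm", 172),
}


def matched_pairs(constructs):
    """Return (Nterm plasmid, Cterm plasmid, split) for fragments sharing a split site."""
    pairs = []
    for n_name, (n_kind, n_split) in constructs.items():
        for c_name, (c_kind, c_split) in constructs.items():
            if n_kind == "Nterm" and c_kind == "Cterm" and n_split == c_split:
                pairs.append((n_name, c_name, n_split))
    return sorted(pairs, key=lambda pair: pair[2])


def fragment_ranges(protein_length, split_after):
    """Residue ranges (1-based, inclusive) of the two fragments for a given split."""
    return (1, split_after), (split_after + 1, protein_length)


for pair in matched_pairs(CONSTRUCTS):
    print(pair)
print(fragment_ranges(238, 172))  # 238 is used only as an illustrative length
```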



Example 6: Eukaryotic expression vectors encoding fusion proteins of EGFP
fragment and zipper

Because of the low fluorescence signal produced by the complementation fragments
based on the 144/145 split, only the complementation fragments that were
based on splits at residues 157/158 or 172/173 were transferred to a eukaryotic
expression system to permit evaluation of fragment complementation in mammalian cells.

NtermEGFP-NZ fragments in PS1596 and PS1597, and CZ-CtermEGFP fragments in
PS1594 and PS1595, are flanked by an NcoI site 5' to the start codons and a BamHI site
3' to the stop codons. The fragments were transferred as blunt-ended NcoI/BamHI
fragments into mammalian expression vectors cut with Eco47III/BamHI. To select for
stable expression of both an NtermEGFP-NZ and a CZ-CtermEGFP expressing plasmid,
the expression vectors for NtermEGFP-NZ fragments and CZ-CtermEGFP fragments
contain selection markers for neomycin/geneticin/G418 and zeocin, respectively.

Plasmids PS1594, PS1595, PS1596, and PS1597 were cut with NcoI restriction enzyme,
blunt-ended with Klenow fragment, gel purified, cut with BamHI and gel purified as
described below.

Restriction digestion of NtermEGFP-NZ and CZ-CtermEGFP prokaryotic expression
vectors

PS1594, PS1595, PS1596, or PS1597 (1 µg/µl)       1 µl
NcoI (10 U/µl, from New England Biolabs)          1 µl
10x buffer 4 (NEB)                                3 µl
H2O                                               25 µl

The plasmids were digested for about 1 hour at 37°C. 1 µl of 1 mM dNTP mix and 1 unit
of Klenow fragment (New England Biolabs) were added and the reactions were incubated
30 minutes at room temperature. The linear plasmid fragments were purified by agarose
gel electrophoresis and recovered from the gel using the QIAquick Gel Extraction kit (spin
columns) from Qiagen and recovered in 50 µl of elution buffer. 5 µl BamHI buffer (New
England Biolabs) and 10 units BamHI enzyme were added. The plasmids were digested
for about 1 hour at 37°C. The desired plasmid fragments were purified by agarose gel
electrophoresis and recovered from the gel using the QIAquick Gel Extraction kit (spin
columns) from Qiagen and recovered in 50 µl of elution buffer.

To stably co-express NtermEGFP-NZ and CZ-CtermEGFP fragments in the same
mammalian cell, mammalian expression vectors carrying different selection markers were
required. To obtain this, the kanamycin/neomycin selection marker on the expression
vector pEGFP-C1 was replaced with a zeocin resistance marker, resulting in the plasmid
referred to as PS0609.



Replacement of kanamycin/neomycin marker on pEGFP-C1 with zeocin marker

pEGFP-C1 was digested with AvrII, which excises the kanamycin/neomycin selection
marker, and following gel purification, the vector fragment was ligated with an
approximately 0.5 kbp AvrII fragment encoding zeocin resistance. This fragment was
isolated by PCR amplification of the zeocin selection marker on plasmid pZeoSV
(Invitrogen) using primers 9655 and 9658 (see Table 2). Both primers contain AvrII
cloning sites and flank the zeocin resistance gene on plasmid pZeoSV including its E. coli
promoter. The top primer 9658 spans the AseI site at the beginning of zeocin, which can
be used to determine the orientation of the AvrII insert relative to the SV40 promoter
which drives resistance in mammalian cells. The resulting plasmid is referred to as
PS0609.

Plasmids pEGFP-C1 (Clontech) and its zeocin-resistant derivative PS0609 were cut with
Eco47III restriction enzyme, gel purified, cut with BamHI and gel purified as described
below. These steps excise EGFP and leave the rest of the vectors intact.

Restriction digestion of eukaryotic expression vectors

pEGFP-C1 or PS0609 DNA (1 µg/µl)          0.5 µl
Eco47III (10 U/µl, from Promega)          1 µl
10x buffer D (Promega)                    3 µl
H2O                                       25.5 µl

The plasmids were digested for about 1 hour at 37°C. The linear plasmid fragments were
purified by agarose gel electrophoresis and recovered from the gel using the QIAquick
Gel Extraction kit (spin columns) from Qiagen and recovered in 50 µl of elution buffer. 5 µl
BamHI buffer (New England Biolabs) and 10 units BamHI enzyme were added. The
plasmids were digested for about 1 hour at 37°C. The desired vector fragments were
purified by agarose gel electrophoresis and recovered from the gel using the QIAquick
Gel Extraction kit (spin columns) from Qiagen and recovered in 50 µl of elution buffer.



Ligation of NtermEGFP-NZ fragments into pEGFP-C1 and CZ-CtermEGFP fragments into
PS0609

Cut and purified vector fragment                              1 µl
Cut and purified NtermEGFP-NZ or CZ-CtermEGFP fragment        3 µl
10x T4 DNA ligase buffer (New England Biolabs)                1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)                 0.5 µl
H2O                                                           5 µl

Ligation reactions were incubated at 16°C overnight. 3 µl were transformed into One Shot
TOP10 chemically competent E. coli cells (Invitrogen) and transformants were selected on
imMedia with kanamycin or imMedia with zeocin (both from Invitrogen) for pEGFP-C1 and
PS0609 derivatives, respectively.

Four transformants from each transformation plate were picked into imMedia medium with
appropriate selection (kanamycin or zeocin) and grown at 37°C for 6 hours.
Plasmid DNA was isolated by the QIAprep spin column method (Qiagen) and analysed by
restriction digests with AseI and MluI. The DNA sequences of the inserts were finally
verified by sequencing as described above. The resulting plasmids were named PS1557,
PS1558, PS1559, and PS1560 (Table 4).

Example 7: EGFP based bimolecular fluorescence complementation in
mammalian cells

To establish cell lines that express EGFP fragment/zipper fusion proteins, CHO-hIR cells
were transfected with plasmid pairs, resulting in two cell lines: 1) CHO-hIR
PS1559+PS1557, and 2) CHO-hIR PS1560+PS1558. The CHO-hIR cell line consists of
CHO-K1 (ATCC CCL-61) cells that have been stably transfected with the human insulin
receptor (hIR, GenBank Acc# M10051) as described in: Hansen, B. F., Danielsen, G. M.,
Drejer, K., Sørensen, A. R., Wiberg, F. C., Klein, H. H., Lundemose, A. G. (1996)
Sustained signalling from the insulin receptor after stimulation with insulin analogues
exhibiting increased mitogenic potency. Biochem. J. Apr 1; 315 (Pt 1):271-279. The
selection marker for the vector is methotrexate (MTX). The hIR expression is very stable
in the CHO-hIR cells because of the insulin-sensitivity of the cell line, and a very stable
phenotype can be maintained without selection pressure.

Stable cells were obtained by cell growth in selection medium containing Geneticin and 
Zeocin. 

CHO-hIR cells were transfected using Fugene (Roche) according to the manufacturer's
instructions. The day after transfection, cells were examined for transient expression, split
1:10 and exposed to selection medium (growth medium supplemented with 500 µg/ml
geneticin (Invitrogen) and 1 mg/ml zeocin (Cayla)). The cell lines were stable after 2-3
weeks of culture in selection medium.

The growth medium used was NUT.MIX F-12 (Ham's) with GLUTAMAX-1
(Gibco/Invitrogen) supplemented with 10% fetal bovine serum (JRH Biosciences) and 1%
Penicillin-Streptomycin (10,000 IU/ml, Gibco/Invitrogen). The CHO-hIR cells were cultured
in growth medium, and split 1:4 to 1:16 twice a week according to standard cell culture
protocols. The CHO-hIR PS1559+PS1557 and CHO-hIR PS1560+PS1558 cells were treated
likewise, except that the growth medium was supplemented with 500 µg/ml geneticin
(Invitrogen) and 1 mg/ml zeocin (Cayla) at all times.

Images of three CHO-hIR cell lines separately transfected with pEGFP-C1 (expressing
EGFP with a short C-terminal extension), PS1559 + PS1557 (expressing EGFP
complementation fragments split at 157-158, NtermEGFP157-NZ + CZ-CtermEGFP158)
and with PS1560 + PS1558 (expressing EGFP complementation fragments split at
172-173, NtermEGFP172-NZ + CZ-CtermEGFP173) were collected 1 day, 2 days and 10
days after transfection to assess the relative brightness of cells expressing the different
complementation constructs. Images were collected on a Nikon Diaphot 300 equipped for
epifluorescence work. The light source for epifluorescence was a Nikon 100W Hg arc lamp,
coupled to the microscope through a custom quartz fibre illuminator (TILL Photonics
GmbH, Planegg, Germany). Excitation light passed through a 450-490 nm bandpass filter
(Delta Light and Optics, Lyngby, Denmark) and was directed to the specimen via a
Chroma 72100 505 nm cut-on dichroic mirror (Chroma Technology, Brattleboro, VT,
USA). A x40, NA 1.3 oil immersion lens was used for all images. Emitted light passed
through a 540-550 nm bandpass filter (Chroma) to a Hamamatsu Orca ER camera. All
images were collected with 50 millisecond exposure time, chosen to ensure
non-saturation of images for even the brightest (EGFP-expressing) cells in each optical field
(maximum pixel count <4095). Imaging software used to acquire images on this system
was IPLab for Windows (Scanalytics, USA).

Presentation and analysis of images

The microscope images were analysed using the ImageJ software package, the public
domain image analysis software written by Wayne Rasband of the US National Institutes of
Health (http://rsb.info.nih.gov/ij/), and the data analysis was performed in Microsoft Excel.
The images shown in Figure 2 are of fluorescent CHO-hIR cells co-transfected with
different NtermEGFP-NZ and CZ-CtermEGFP expression vectors or transfected with
pEGFP-C1. The images are scaled individually to visualise the cells and the fluorescence
distribution within them. Because of this scaling, the relative fluorescence levels cannot be
compared between the images. When the same images are scaled identically they appear
as in Figure 3, and it is apparent that the cells that are transfected with complementation
constructs that are based on a split between residues 172 and 173 are significantly more
fluorescent than the cells that are transfected with complementation constructs that are
based on a split between residues 157 and 158. However, the cells transfected with the
pEGFP-C1 construct show significantly stronger fluorescence on day 2.

The same images were analysed for background and maximum fluorescence intensities
using the ImageJ software package (Figure 4). From the figure, it is clear that a split
between residues 172 and 173, and probably anywhere else in this loop, is greatly
superior to a split between residues 157 and 158 and probably also to splits anywhere
else in this loop.
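
The difference between individual and identical scaling, and the extraction of background and maximum intensities, can be illustrated with the sketch below. It is not the ImageJ procedure used for Figures 2 to 4; the use of the median as background estimate, the 8-bit display range, the tiny example images and all names are assumptions made for illustration. Only the saturation limit of 4095 counts is taken from the text.

```python
def background_and_max(image):
    """Median pixel value as a simple background estimate, plus the maximum."""
    pixels = sorted(p for row in image for p in row)
    return pixels[len(pixels) // 2], pixels[-1]


def is_saturated(image, max_count=4095):
    """True if any pixel reaches the maximum count of the camera."""
    return any(p >= max_count for row in image for p in row)


def scale_to_8bit(image, low, high):
    """Map the range [low, high] linearly onto 0..255 for display."""
    span = high - low
    return [[max(0, min(255, round(255 * (p - low) / span))) for p in row]
            for row in image]


dim = [[100, 120], [110, 400]]       # hypothetical weakly fluorescent field
bright = [[100, 150], [120, 3000]]   # hypothetical strongly fluorescent field

# Individual scaling: each image fills the display range, so brightness
# cannot be compared between images.
print(scale_to_8bit(dim, 100, 400)[1][1], scale_to_8bit(bright, 100, 3000)[1][1])

# Identical scaling: both images share one range, so the dim field stays dim.
print(scale_to_8bit(dim, 100, 3000)[1][1], scale_to_8bit(bright, 100, 3000)[1][1])

print(background_and_max(dim), background_and_max(bright), is_saturated(bright))
```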

Example 8: Eukaryotic expression vectors encoding EYFP and EYFP variant
F64L fragment/zipper fusion proteins

Mutagenesis of the eukaryotic NtermEGFP-NZ expression vectors PS1559
(NtermEGFP157-NZ) and PS1560 (NtermEGFP172-NZ) into the corresponding
N-terminal EYFP (SEQ ID NO: 5) fragment (NtermEYFP-NZ) variants and mutagenesis of
the eukaryotic CtermEGFP expression vectors PS1557 (CZ-CtermEGFP158) and
PS1558 (CZ-CtermEGFP173) into the corresponding C-terminal EYFP fragment
(CZ-CtermEYFP) variants was accomplished by site directed mutagenesis using the
QuikChange kit and by following the manufacturer's instructions (Stratagene). Primers
2333 and 2334 were used to convert expression vectors PS1559 (NtermEGFP157-NZ)
and PS1560 (NtermEGFP172-NZ) into N-terminal EYFP fragment expression vectors
PS1639 (NtermEYFP157-NZ) and PS1642 (NtermEYFP172-NZ). The introduced
mutations were: L64F:T65G:V68L:S72A. Furthermore, primers 2335 and 2336 were used
to convert expression vectors PS1559 (NtermEGFP157-NZ) and PS1560
(NtermEGFP172-NZ) into F64L mutated N-terminal EYFP fragment expression vectors
PS1640 (NtermE[F64L]YFP157-NZ) and PS1641 (NtermE[F64L]YFP172-NZ). The
introduced mutations were: T65G:V68L:S72A. Accordingly, the expressed NtermEYFP
fragments have the following amino acid sequences (only residues 64-72 are shown):

                                      64  65  66  67  68  69  70  71  72
NtermEGFP (template)                  L   T   Y   G   V   Q   C   F   S
NtermEYFP (L64F:T65G:V68L:S72A)       F   G   Y   G   L   Q   C   F   A
NtermE[F64L]YFP (T65G:V68L:S72A)      L   G   Y   G   L   Q   C   F   A

Finally, primers 2337 and 2338 were used to convert expression vectors PS1557
(CZ-CtermEGFP158) and PS1558 (CZ-CtermEGFP173) into C-terminal EYFP fragment
expression vectors PS1637 (CZ-CtermEYFP158) and PS1638 (CZ-CtermEYFP173) by
introducing a T203Y mutation. All sequences were verified by DNA sequencing of the
vectors and all primer sequences are shown in Table 2.
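
The mutation notation used in this example (for instance L64F: leucine at residue 64 replaced by phenylalanine) can be checked mechanically. The following is a minimal sketch, not part of the described method; the function name `apply_mutations` and the nine-residue fragment string are illustrative assumptions taken from the residue 64-72 listing above.

```python
import re

def apply_mutations(fragment, first_residue, mutations):
    """Apply point mutations such as 'L64F' to a protein fragment.

    fragment      -- one-letter amino acid string
    first_residue -- residue number of fragment[0] in the full-length protein
    mutations     -- iterable of strings of the form <old><position><new>
    """
    residues = list(fragment)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the fragment")
        if residues[index] != old:
            raise ValueError(f"{mutation}: found {residues[index]} at {position}")
        residues[index] = new
    return "".join(residues)

# Residues 64-72 of the NtermEGFP template, as listed in the text.
EGFP_64_72 = "LTYGVQCFS"

eyfp = apply_mutations(EGFP_64_72, 64, ["L64F", "T65G", "V68L", "S72A"])
f64l_eyfp = apply_mutations(EGFP_64_72, 64, ["T65G", "V68L", "S72A"])
print(eyfp)       # FGYGLQCFA
print(f64l_eyfp)  # LGYGLQCFA
```

The check on the expected original residue guards against applying a mutation list to the wrong template or with the wrong numbering offset.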

Example 9: EGFP based bimolecular fluorescence complementation in 
mammalian cells 

The constructed EYFP based split fluorescent protein expression vectors PS1637 to 
PS1642 described above were investigated in mammalian cells in parallel with the EGFP 
based split fluorescent protein expression vectors PS1557 to PS1560 described in 
Example 7 and using the same experimental set-up (including the same filter set) and 
procedures (including the image analysis procedure) except that all images were 
produced using 10 ms exposure times instead of 50 ms exposure times, because of the 
increased brightness of the probes, and a 20x objective was used instead of a 40x 
objective to image more cells. Other appropriate filter sets could have been used. The 
images are taken the day after transfection (day 1). 

It is apparent from the identically scaled fluorescence images of the transfected cells
(Figure 6) that the split site between residues 172 and 173 is again shown to be superior
to the split site between residues 157 and 158. Furthermore, it is apparent that
complementation based on EYFP fragments is superior to complementation based on
EGFP fragments. Surprisingly, introduction of the F64L mutation from EGFP into the
N-terminal EYFP fragments further greatly enhanced the fluorescence of the complementing
fragments. As can be seen from the images, the positive effects of using the optimal
splitting site (between residues 172 and 173), using the optimal fluorescent protein colour
variant (EYFP) and introducing the F64L folding mutation into the NtermEYFP fragment
are additive. Quantification of these observations was done by analysing the images shown
in Figure 6 and the numeric output is presented in Figure 7.

Effects of colour (yellow better):

Good (EGFP)                              Better (EYFP)
NtermEGFP157-NZ + CZ-CtermEGFP158   vs   NtermEYFP157-NZ + CZ-CtermEYFP158
NtermEGFP172-NZ + CZ-CtermEGFP173   vs   NtermEYFP172-NZ + CZ-CtermEYFP173



Effects of split site (172/173 better):

Good                                          Better
NtermEGFP157-NZ + CZ-CtermEGFP158        vs   NtermEGFP172-NZ + CZ-CtermEGFP173
NtermEYFP157-NZ + CZ-CtermEYFP158        vs   NtermEYFP172-NZ + CZ-CtermEYFP173
NtermE[F64L]YFP157-NZ + CZ-CtermEYFP158  vs   NtermE[F64L]YFP172-NZ + CZ-CtermEYFP173



Effects of F64L (+F64L better):

Good                                     Better
NtermEYFP157-NZ + CZ-CtermEYFP158   vs   NtermE[F64L]YFP157-NZ + CZ-CtermEYFP158
NtermEYFP172-NZ + CZ-CtermEYFP173   vs   NtermE[F64L]YFP172-NZ + CZ-CtermEYFP173



It is interesting to note that the optimal constructs (NtermE[F64L]YFP172-NZ and
CZ-CtermE[F64L]YFP173), when re-assembled, are nearly as intense as EYFP itself. The great
increase in fluorescence intensity is important in many types of quantitative cell analyses
(e.g. high throughput screening and microscopy) to increase the signal-to-noise ratios,
to facilitate detection of low amounts of probes in vivo or in vitro, etc.

Mixing NtermEYFP with CtermEGFP or NtermEGFP with CtermEYFP fragments can also
produce functional fluorescent complexes, potentially of different colours (Figs. 8 and 9).
Fragments having overlapping sequences are also functional and may be very attractive
in e.g. functional cloning systems where highly flexible linker sequences are required due
to the very diverse nature of the fusion partners. The overlapping fragments permit either
of the fusion partners to have a long linker sequence (Figure 8, quantified in Figure 9).

Example 10: Construction of PS1769, PS1767, PS1771 and PS1768

Plasmid PS1769 encodes a fusion of NtermE[F64L]YFP172 and FKBP, connected by a
linker sequence GSGSGSGDITSLYKKAGST (1 letter amino acid code, SEQ ID NO: 11)
derived in part from the Gateway recombination sequence.

Plasmid PS1767 encodes a fusion of NtermE[F64L]YFP172 and the FKBP binding part of
FRAP, FRB (amino acids 2025-2114 of FRAP), connected by a linker sequence
GSGSGSGDITSLYKKAGST (1 letter amino acid code, SEQ ID NO: 12) derived in part
from the Gateway recombination sequence.

Plasmid PS1771 encodes a fusion of FRB and CtermEYFP173, connected by a linker
sequence DPAFLYKWISGSGSGSG (1 letter amino acid code, SEQ ID NO: 13) derived
in part from the Gateway recombination sequence.

Plasmid PS1768 encodes a fusion of FKBP and CtermEYFP173, connected by a linker
sequence DPAFLYKWISGSGSGSG (1 letter amino acid code, SEQ ID NO: 14) derived
in part from the Gateway recombination sequence.

Construction of plasmid PS1769. 

Plasmid PS1769 encodes a fusion of NtermE[F64L]YFP172 and FKBP, connected by a
linker sequence, under the control of a CMV promoter and with kanamycin and neomycin
resistance as selectable marker in E.coli and mammalian cells, respectively.

Plasmid PS1769 was derived from plasmids PS1779 (entry clone) and PS1679 
(destination vector). Plasmid PS1679 was derived from plasmids PS1672 and pEGFP- 
C1 (Clontech). Plasmid PS1672 was derived from plasmid PS1641 described above. 

Construction of intermediate PS1672. 
PS1641 was subjected to PCR with primers 2219 and 2222 (Table 2), and the ca 0.5 kb
Nhe1-BamH1 fragment was ligated into pEGFP-C1 (Clontech) digested with Nhe1 and
BamH1. This replaces NtermEGFP with NtermE[F64L]YFP172 followed by a linker
sequence, which encodes in frame linker sequence Gly-Ser-Gly-Ser-Gly-Ser-Gly, and a
unique EcoRV site just upstream of BamH1. This plasmid is called PS1672.

Construction of destination vector PS1679. 

Plasmid PS1672 was converted into a Gateway compatible destination vector by cutting
the DNA with EcoRV and ligating it with Gateway Cassette reading frame A, following the
recommendations of the Gateway manufacturer (Invitrogen). This destination vector is
called PS1679.

Construction of Gateway entry clone PS1779. 

The coding sequence of FKBP (GenBank Acc no XM_016660) was isolated from human
cDNA using PCR and primers 2442 and 1272 (Table 2). The ca 0.4 kb product was
transferred by a BP reaction into donor vector pDONR207, following the manufacturer's
recommendations (Invitrogen), to produce entry clone PS1779.

Finally, the expression vector PS1769 was produced by transferring FKBP from entry
clone PS1779 with an LR reaction into destination vector PS1679 following the
manufacturer's recommendations (Invitrogen).

Construction of plasmid PS1767. 

Plasmid PS1767 encodes a fusion of NtermE[F64L]YFP172 and the FKBP binding part of
FRAP, FRB (amino acids 2025-2114 of FRAP), connected by a linker sequence, under
the control of a CMV promoter and with kanamycin and neomycin resistance as
selectable marker in E.coli and mammalian cells, respectively.

Plasmid PS1767 was derived from plasmids PS1781 (entry clone) and PS1679 
(destination vector). Plasmid PS1679 was constructed as described above.

Construction of Gateway entry clone PS1781. 

The FKBP binding part of FRAP (amino acids 2025-2114, GenBank Acc no XMJ301528)
was isolated from human cDNA using PCR and primers 2444 and 1268 (Table 2). The ca
0.3 kb product was transferred by a BP reaction into donor vector pDONR207, following
the manufacturer's recommendations (Invitrogen), to produce entry clone PS1781.

Finally, the expression vector PS1767 was produced by transferring FRB from entry clone
PS1781 with an LR reaction into destination vector PS1679 following the manufacturer's
recommendations (Invitrogen).

Construction of plasmid PS1771.

Plasmid PS1771 encodes a fusion of the FKBP binding part of FRAP called FRB (amino
acids 2025-2114 of FRAP) and the C-terminal of EYFP (FRB-CtermEYFP173), connected
by a linker sequence, under the control of a CMV promoter and with zeocin resistance as
selectable marker in E.coli and mammalian cells.

Plasmid PS1771 was derived from plasmids PS1782 (entry clone) and PS1688
(destination vector). Plasmid PS1688 was derived from plasmids PS1674 and PS609
described above. Plasmid PS1674 was derived from plasmid PS1638 described above.

Construction of intermediate PS1674. 

PS1638 was subjected to PCR with primers 2225 and 2132 (Table 2), and the ca 0.25 kb
Nhe1-BamH1 fragment was ligated into PS609 digested with Nhe1 and BamH1. This
replaces EGFP with EYFP(173-238) preceded by a linker sequence, which encodes in
frame linker sequence Gly-Ser-Gly-Ser-Gly-Ser-Gly, and a unique EcoRV site just
downstream of Nhe1. This plasmid is called PS1674.

Construction of destination vector PS1688. 

Plasmid PS1674 was converted into a Gateway compatible destination vector by cutting
the DNA with EcoRV and ligating it with Gateway Cassette reading frame A, following the
recommendations of the Gateway manufacturer (Invitrogen). This destination vector is
called PS1688.

Construction of Gateway entry clone PS1782. 

The FKBP binding part of FRAP (GenBank Acc no XMJ301528, amino acids 2025-2114)
was isolated from human cDNA using PCR and primers 2444 and 2445 (Table 2). The ca
0.3 kb product was transferred by a BP reaction into donor vector pDONR207, following
the manufacturer's recommendations (Invitrogen), to produce entry clone PS1782.

Finally, the expression vector PS1771 was produced by transferring FRB from entry clone
PS1782 with an LR reaction into destination vector PS1688 following the manufacturer's
recommendations (Invitrogen).

Construction of plasmid PS1768.

Plasmid PS1768 encodes a fusion of FKBP and EYFP(173-238) (FKBP-CtermEYFP173),
under the control of a CMV promoter and with zeocin resistance as selectable marker in
E.coli and mammalian cells.

Plasmid PS1768 was derived from plasmids PS1780 (entry clone) and PS1688 
(destination vector). Plasmid PS1688 was constructed as described above. 

Construction of Gateway entry clone PS1780.

The coding sequence of FKBP (GenBank Acc no XM_016660) was isolated from human
cDNA using PCR and primers 2442 and 2443 (Table 2). The ca 0.4 kb product was
transferred by a BP reaction into donor vector pDONR207, following the manufacturer's
recommendations (Invitrogen), to produce entry clone PS1780.

Finally, the expression vector PS1768 was produced by transferring FKBP from entry
clone PS1780 with an LR reaction into destination vector PS1688 following the
manufacturer's recommendations (Invitrogen).

Example 11: Construction of an inducible interaction system using the GFP
complementation method that demonstrates utility of the method in
screening for compounds that inhibit protein-protein interactions.

The immunosuppressive compound rapamycin binds to FK506 binding protein (FKBP)
and simultaneously to the large PI3Kinase homolog FRAP (also known as mTOR or
RAFT), and thus serves as a heterodimeriser compound for these two proteins. To use
rapamycin to induce heterodimers between proteins of interest, one of the proteins is
fused to FKBP domains, and the other to a 90 amino acid portion of FRAP, termed FRB,
that is sufficient for binding the FKBP-rapamycin complex (Chen et al., PNAS 92, 4947
(1995)). In this example fusions of FRB and FKBP were made to complementary halves
of split-EYFP (which included the F64L mutation in the EYFP(1-172) sequence
(NtermE[F64L]YFP172)), so that the complementation reaction could be controlled by
addition of rapamycin.

This example demonstrates that a model GFP complementation system using
components which can be made to interact conditionally does respond as expected in a
dose-dependent manner to the interaction stimulus. The example also provides
information about the rate of fluorescence development for the E[F64L]YFP
complementation system. Further, it demonstrates that the system can be used to detect
compounds that will block the interaction of proteins fused to the complementary halves
of the E[F64L]YFP complementation system.

The following fusion constructs were made as described in Example 10:

NtermE[F64L]YFP172-FKBP = plasmid code PS1769
FRB-CtermEYFP173        = PS1771
NtermE[F64L]YFP172-FRB  = PS1767
FKBP-CtermEYFP173       = PS1768

Probes were co-transfected in pairs into CHO-hIR cells (supra), PS1769 with PS1771 and
PS1767 with PS1768, using the transfection agent FuGENE™ 6 (Boehringer Mannheim
Corp, USA) according to the method recommended by the suppliers. Cells were cultured
in growth medium (HAM's F12 nutrient mix with Glutamax-1, 10 % foetal bovine serum
(FBS), 100 µg penicillin-streptomycin mixture ml⁻¹ (GibcoBRL, supplied by Life
Technologies, Denmark)). Transfected cells were cultured in this medium, with the
addition of two selection agents appropriate to the plasmids being used, being 1 mg/ml
zeocin plus 0.5 mg/ml G418 sulphate. Cells were cultured at 37°C in 100% humidity and
conditions of normal atmospheric gases supplemented with 5% CO2.

After 10 to 12 days culture in the continuous presence of the selection agents, the
resulting cell lines were judged to be stably transfected. For fluorescence microscopy,
aliquots of cells were transferred to Lab-Tek chambered cover glasses (Nalge Nunc
International, Naperville USA) and allowed to adhere for at least 24 hours to reach about
80% confluence. Images were routinely collected using a Nikon Diaphot 300 inverted
fluorescence microscope (Nikon Corp., Tokyo, Japan) using x20 (dry) and/or x40 (oil
immersion) objectives and coupled to an Orca ER charge-coupled device (CCD) camera
(Hamamatsu Photonics K.K., Hamamatsu City, Japan). The cells are illuminated with
a 100 W HBO arc lamp via a 470±20 nm excitation filter, a 510 nm dichroic mirror and a
515±15 nm emission filter for minimal image background. Image collection, subsequent
measurement and analysis of fluorescence intensity were all controlled by IPLab
Spectrum for Windows software (Scanalytics, Fairfax, VA USA).

Cells were also grown for 16 hours from a seeding density of approximately 1.0 x 10^5 cells
per 400 µl in plastic 96-well plates (Polyfiltronics Packard 96-View Plate or Costar Black
Plate, clear bottom; both types tissue culture treated) both for imaging purposes and for
measurements of fluorescence intensity in fluorescence plate readers. Prior to
experiments, the cells are cultured overnight without selection agent(s) in HAM F-12
medium with glutamax, 100 µg penicillin-streptomycin mixture ml⁻¹ and 10 % FBS. This
medium has low autofluorescence, enabling fluorescence measurements on cells straight
from the incubator. For endpoint measurements, cells in plates were routinely fixed with
4% formaldehyde in phosphate buffered saline (PBS) + 10 µM Hoechst 22538 for 10
minutes, followed by 3 wash steps using PBS. The use of the nuclear dye Hoechst 22538
enables correction of the EYFP fluorescence signal from each well for cell density. Plates
prepared in this way were measured on a Fluoroskan Ascent CF plate reader
(Labsystems, Finland) equipped with appropriate filter sets (EYFP: excitation 485 nm,
emission 527 nm; Hoechst 22538: excitation 355 nm, emission 460 nm).
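
The cell-density correction described above can be sketched as a per-well ratio. This is an illustrative assumption about how such a correction might be computed, not the exact calculation used here: the function name `eyfp_per_cell` and the blank-subtraction step are hypothetical, and the result is in arbitrary units.

```python
def eyfp_per_cell(eyfp_raw, hoechst_raw, eyfp_blank=0.0, hoechst_blank=0.0):
    """Return the EYFP signal of one well normalised to its Hoechst signal.

    eyfp_raw, hoechst_raw     -- plate reader values for the well
    eyfp_blank, hoechst_blank -- values from a cell-free well (e.g. PBS only)
    The Hoechst signal is used as a proxy for the number of nuclei in the well.
    """
    eyfp = eyfp_raw - eyfp_blank
    hoechst = hoechst_raw - hoechst_blank
    if hoechst <= 0:
        raise ValueError("Hoechst signal must exceed its blank")
    return eyfp / hoechst

# Hypothetical readings for a single well (arbitrary units).
print(eyfp_per_cell(1100.0, 2100.0, eyfp_blank=100.0, hoechst_blank=100.0))  # 0.5
```

Dividing by the nuclear stain signal means that two wells with the same per-cell EYFP intensity give the same value even if one well contains more cells.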

Both cell lines, CHO-hIR [PS1769 + PS1771] and CHO-hIR [PS1767 + PS1768],
responded to rapamycin with a substantial increase in EYFP fluorescence after several
hours incubation, as expected (Figure 10). At the starting condition for these cells (t=0),
fluorescence is barely visible in most cells, although it was noted that some cells (< 5%) in
the population had low, but appreciable, fluorescence before treatment (Figure 10a). After
4 hours (Figure 10b) many cells (approximately 40%) had developed significantly greater
EYFP fluorescence throughout the cytoplasmic and nuclear compartments. After 16 hours
(Figure 10c) the response per cell had increased further and encompassed a larger
proportion of the cell population (approximately 70%). Results were essentially identical
for the second cell line CHO-hIR [PS1769 + PS1771].

The graph in Figure 11 shows the rate of development of cellular EYFP fluorescence
following rapamycin treatment of the CHO-hIR [PS1767 + PS1768] line. Cells were
treated in 96-well plates with 3 µM rapamycin and the fluorescence measured at various
times. Treatment and measurements were made with the cells growing in HAM's medium
+ 10% FBS, and fluorescence measurements were corrected for the background
fluorescence from this medium. The graph demonstrates that the half-time for
development of fluorescence is approximately 5 hours. The rate of development of
fluorescence includes time for interaction between FKBP and FRB mediated by the
dimeriser rapamycin, plus the time for annealing of the EYFP moieties, and the
(presumably much longer) time needed for maturation of the fluorophore within the
successfully annealed EYFP protein.
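
A half-time of this kind can be read from a time course without assuming any kinetic model, by interpolating the time at which the signal reaches half of its final increase. The sketch below is one possible approach and is not taken from the example; the function name `half_time` is hypothetical, the last time point is assumed to represent the plateau, and the numbers in the usage lines are invented purely to illustrate the arithmetic.

```python
def half_time(times, signals):
    """Estimate the time at which the signal reaches half of its total rise.

    times   -- increasing time points (e.g. hours after adding the stimulus)
    signals -- background-corrected fluorescence at those time points
    Assumes signals[0] is the baseline and signals[-1] is the plateau.
    """
    if len(times) != len(signals) or len(times) < 2:
        raise ValueError("need at least two paired time/signal points")
    baseline, plateau = signals[0], signals[-1]
    target = baseline + 0.5 * (plateau - baseline)
    for i in range(len(times) - 1):
        s0, s1 = signals[i], signals[i + 1]
        if s0 <= target <= s1 and s1 > s0:
            # Linear interpolation between the two bracketing time points.
            fraction = (target - s0) / (s1 - s0)
            return times[i] + fraction * (times[i + 1] - times[i])
    raise ValueError("signal never crosses the half-maximal level")

# Invented illustrative values, not measurements from Figure 11.
print(half_time([0, 2, 4, 8], [0.0, 40.0, 60.0, 100.0]))  # 3.0
```

Because the estimate depends on the final point being at plateau, a time course that is still rising at its last measurement would give an underestimate.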

Figure 12 is a response curve to different rapamycin doses for the CHO-hIR [PS1769 +
PS1771] cell line. Cells were cultured in 96-well plates, treated with various rapamycin
doses for 16 hours, then fixed and stained with Hoechst prior to determination of EYFP
fluorescence/cell (arbitrary units) on the Ascent plate reader. Values are corrected for
PBS background as well as cell number. The cell line shows approximately a 3-fold
increase in the EYFP intensity/cell over the dose range of rapamycin used in this
experiment.

One way to increase the dynamic range of the response, and to decrease the inherent
EYFP background signal from these cell lines, is to remove the fraction of cells that are
EYFP bright prior to rapamycin stimulation. This is easily accomplished through
fluorescence activated cell sorting (FACS) methods. Each of the cell lines was sorted by
this method into 3 groups: (i) most green group, (ii) medium to low-green group and (iii)
black group. The 'most green' was discarded in each case, while the other 2 groups were
cultured for further use. Figure 13(a) and Figure 13(b) show the improved response to
100 nM rapamycin of cell line CHO-hIR [PS1767 + PS1768] after the sorting procedure.

Figure 14(a) and (b) show the response of the 'medium to low-green' and 'black' FACS
groups (respectively) derived from the CHO-hIR [PS1767 + PS1768] parent line. Dose
response to rapamycin was measured after 7 hours (a) and 30 hours (b) for each cell line.
Values for fluorescence have been corrected for plate & medium background. Increase in
EYFP fluorescence is better than 20-fold the unstimulated value in each case.
Unexpectedly, the absolute fluorescence signal does not appear to change significantly
between 7 and 30 hours, although the cells are still alive during this period. Furthermore,
the dose-response curves at 7 and 30 hours for each cell line are very closely similar, with
an EC50 of approximately 0.25 µM in the 'medium to low-green' group, and 0.1 µM in the
'black' group. These data suggest that once the dimerisation has occurred, the EYFP
complements are stable within the cells for longer than 30 hours. The 'medium to low-green'
group has a greater overall response range, reaching intensities of greater than 3-fold
that of the black group at the highest rapamycin concentration. Both FACS groups
have significantly lower pre-stimulation fluorescence intensities compared to the parent
(non-FACS'd) lines.
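
For orientation, an EC50 is the dose giving a half-maximal response on a sigmoidal dose-response curve. The sketch below uses the standard Hill (four-parameter logistic) equation as an assumed model; the text does not state which curve-fitting model was used, and the function name `hill_response` and all parameter values are illustrative only.

```python
def hill_response(dose, bottom, top, ec50, hill_slope=1.0):
    """Response predicted by the Hill equation for a given dose.

    bottom     -- response without stimulus (e.g. unstimulated fluorescence)
    top        -- response at saturating dose
    ec50       -- dose giving a response halfway between bottom and top
    hill_slope -- steepness of the curve (1.0 assumed by default)
    """
    if dose < 0 or ec50 <= 0:
        raise ValueError("dose must be >= 0 and ec50 must be > 0")
    if dose == 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50 / dose) ** hill_slope)

# Illustrative parameters: a 21-fold maximal response and an EC50 of 0.25 (in µM).
print(hill_response(0.25, bottom=1.0, top=21.0, ec50=0.25))  # 11.0
print(hill_response(0.0, bottom=1.0, top=21.0, ec50=0.25))   # 1.0
```

Under this model the response at the EC50 is by construction the midpoint between the unstimulated and the saturated level, which is the property used when reading an EC50 from a curve.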

Figure 15(a) and (b) show dose-response competition curves for FK506 versus 100 nM
rapamycin in two of the FACS'd lines, CHO-hIR [PS1768 + PS1767] 'mid to low-green'
group (Figure 15(a)) and CHO-hIR [PS1769 + PS1771] 'black' group (Figure 15(b)). EC50
values in both cases are approximately 1.2 µM FK506. The cells were incubated
overnight (16 hours) with mixtures of the two compounds, then fixed and stained with
Hoechst prior to determination of EYFP fluorescence/cell on an Ascent plate reader. Plate
and solution backgrounds have been subtracted; the dashed lines on each graph indicate
the prestimulated fluorescence levels for each cell line in these experiments. These
results indicate that the GFP complementation method employing fusions to
NtermE[F64L]YFP172 and CtermEYFP173 may be used successfully to screen for
compounds that interfere with a conditional interaction between two protein components.

Figure legends

Figure 1

General structures of the fusion protein coding sequences.

Figure 2

16 bit images of fluorescent CHO-hIR cells co-transfected with NtermEGFP-NZ and
CZ-CtermEGFP expression vectors or transfected with pEGFP-C1 were taken and scaled
individually to visualise the cells and the fluorescence distribution within them. Because of
the pixel intensity scaling, the relative fluorescence levels cannot be compared among the
images. The splitting sites are either between residues 157/158 (top row, plasmids
PS1557 and PS1559) or between residues 172/173 (middle row, plasmids PS1558 and
PS1560). The EGFP expression vector pEGFP-C1 was transfected into the cells in the
bottom row. The images were taken 1 day (left column), 2 days (middle column), or 10
days (right column) after transfection. The images of the cells are representative of the
cells that expressed functionally complementing fragments.

Figure 3

The same 16 bit images of fluorescent CHO-hIR cells co-transfected with NtermEGFP-NZ
and CZ-CtermEGFP expression vectors or transfected with pEGFP-C1 as shown in
Figure 2, but the images are now shown with the same intensity scaling to allow
comparison of fluorescence intensities. The cells that are transfected with
complementation constructs that are based on a split between residues 172 and 173
(middle row) are clearly more fluorescent than the cells that are transfected with
complementation constructs that are based on a split between residues 157 and 158 (top
row). However, the cells transfected with the pEGFP-C1 construct (bottom row) show
significantly stronger fluorescence at day 2.

Figure 4

The unmanipulated microscope images shown in Figure 3 were analysed using the
ImageJ software package and data analysis was performed in Microsoft Excel. For each
16-bit monochrome IPLab microscope image, pixel intensity data were produced in
ImageJ and exported to an Excel spreadsheet for data analysis. The darkest and
brightest 0.5% of the pixels were identified in each image and the average intensities of
these two groups of pixels were calculated. The average intensity of the 0.5% darkest
pixels was defined as the background fluorescence intensity (shown as white bars in the
histogram) and the average intensity of the 0.5% brightest pixels was defined as the maximum
intensity. The difference in intensity between the maximum intensity and the background
intensity was defined as the response (shown as cross hatched bars in the histogram).
The sum of the background intensity and the response is equal to the maximum intensity.
From the figure, it is clear that EGFP based fluorescence complementation using a split
between residues 172 and 173, and probably anywhere else in this loop, is greatly
superior to EGFP based fluorescence complementation using a split between residues
157 and 158 and probably also to splits anywhere else in this loop.
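
The background, maximum and response values defined in this legend can be reproduced from a list of pixel intensities. The following is a minimal sketch of that calculation in Python rather than the ImageJ and Excel workflow actually used; the function name `image_response` and the rounding of the 0.5% group size to a whole number of pixels are assumptions.

```python
def image_response(pixels, fraction=0.005):
    """Return (background, maximum, response) for one image.

    pixels   -- flat sequence of pixel intensities (e.g. 16-bit values)
    fraction -- share of pixels in the darkest and brightest groups (0.5%)
    background is the mean of the darkest group, maximum the mean of the
    brightest group, and response the difference between the two.
    """
    ordered = sorted(pixels)
    if not ordered:
        raise ValueError("image contains no pixels")
    group_size = max(1, round(len(ordered) * fraction))
    background = sum(ordered[:group_size]) / group_size
    maximum = sum(ordered[-group_size:]) / group_size
    return background, maximum, maximum - background

# Synthetic 1000-pixel image with intensities 0..999 (not real data).
background, maximum, response = image_response(range(1000))
print(background, maximum, response)  # 2.0 997.0 995.0
```

Averaging over a group of pixels instead of taking the single darkest and brightest pixel makes the values less sensitive to isolated hot or dead pixels.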

Figure 5 

Positions of appropriate fluorescent protein splitting sites are shown on ribbon and wire
frame representations of GFP. The two representations show the same sites from two
sides (molecule rotated approximately 180 degrees around a vertical axis).

Figure 6

Co-transfection of expression vectors expressing EGFP and EYFP based
complementation fragments as described in Figure 3 to compare the abilities of the
various complementation fragments to combine in cells and produce functional
complexes. All images are scaled identically to allow direct comparison of fluorescence
intensities between the images.

Single transfections with N-terminal fragments only resulted in no detectable fluorescence
above the background level (data not shown). These N-terminal fragments contain amino
acid residues 65-67 forming the chromophore in full-length GFP.

Figure 7

Quantitative analysis of the images shown in Figure 6. The results are in accord with the 
impressions from visual inspection of the cells. The data were produced as described in 
the legend to Figure 4. 



Figure 8

Co-transfection of expression vectors expressing EGFP and EYFP based
complementation fragments as described in Figure 3 to compare the effects of mixing
differently colored EGFP, EYFP and EYFP F64L fragments and to determine the
influence of overlapping fragments, e.g. combining fragments encoding residues 1-172
and 158-238. All color combinations complement but typically less efficiently than in the
correct combinations, i.e. when no or few residues overlap. Fragments having overlapping
regions are also functional and this may be advantageous in experiments where longer
linker sequences are or may be required by the fusion partners due to steric hindrance.
This was not the case in this experiment, where the fusion partners are leucine zippers.
In the example (middle column), residues 158-172 were present in both fragments. In all
situations, the F64L mutation has a favorable effect on the fluorescence intensities. All images are
scaled identically to allow direct comparison of fluorescence intensities between the
images.



Figure 9 

Quantitative analysis of the images shown in Figure 8. The results can be compared
directly with the results shown in Figure 7 and they are in accord with the impressions
from visual inspection of the cells. The data were produced as described in the legend
to Figure 4.



Figure 10 

CHO-hIR [PS1767 + PS1768] cells at 3 time points after treatment with 1 µM rapamycin.
Note that image (c) was taken at 25 msec exposure, the previous 2 images at exposures
of 100 msec each.

(a) is the starting condition for these cells (t=0), and fluorescence is barely visible in most
cells, although it was noted that some cells (< 5%) in the population had low, but
appreciable, fluorescence before treatment.

(b) After 4 hours many cells (approximately 40%) had developed significantly greater
EYFP fluorescence throughout the cytoplasmic and nuclear compartments.

(c) After 16 hours the response per cell had increased further and encompassed a larger
proportion of the cell population (approximately 70%).

Figure 11 

The rate of development of cellular EYFP fluorescence following rapamycin treatment of
the CHO-hIR [PS1767 + PS1768] line. Cells were treated in 96-well plates with 3 µM
rapamycin and the fluorescence measured. Treatment and measurements were made
with the cells growing in HAM's medium + 10% FBS, and fluorescence measurements
were corrected for the background fluorescence from this medium. The graph
demonstrates that the half-time for development of fluorescence is approximately 5 hours.
Values corrected for HAM's background, each value a mean ± sd for 8 measurements.

Figure 12 

Response curve to different rapamycin doses for the CHO-hIR [PS1769 + PS1771] cell
line. Cells were cultured in 96-well plates, treated with various rapamycin doses for 16
hours, then fixed and stained with Hoechst prior to determination of EYFP
fluorescence/cell (arbitrary units) on the Ascent plate reader. Values are corrected for
PBS background as well as cell number. The cell line shows approximately a 3-fold
increase in the EYFP intensity/cell over the dose range of rapamycin used in this
experiment.

Figure 13 

Each of the cell lines was fluorescence activated cell sorted (FACS) into 3 groups: (i)
most green group, (ii) medium to low-green group and (iii) black group. The 'most green'
was discarded in each case, while the other 2 groups were cultured for further use.

A: CHO-hIR [PS1768 + PS1767] FACS group 'Black' before stimulation (i), and after 16
hours stimulation with 100 nM rapamycin (ii) & (iii). Images (i) and (ii) were exposed for
100 msec, image (iii) for 25 msec.

B: CHO-hIR [PS1768 + PS1767] FACS group 'medium-low green' before stimulation (i),
and after 16 hours stimulation with 100 nM rapamycin (ii) & (iii). Images (i) and (ii) were
exposed for 100 msec, image (iii) for 25 msec.



Figure 14 

Shows the response of the 'medium to low-green' (a) and 'black' (b) FACS groups
derived from the CHO-hIR [PS1767 + PS1768] parent line (see Figure 13).
Dose response to rapamycin was measured after 7 hours (i) and 30 hours (ii) for each cell
line. Values for fluorescence have been corrected for plate & medium background.
Increase in EYFP fluorescence is better than 20-fold the unstimulated value in each case.



Figure 15

Shows dose-response competition curves for FK506 versus 100 nM rapamycin in two of
the FACS'd lines, (a) CHO-hIR [PS1768 + PS1767] 'mid to low-green' group, and (b)
CHO-hIR [PS1769 + PS1771] 'black' group. EC50 values in both cases are approximately
1.2 µM FK506. The cells were incubated overnight (16 hours) with mixtures of the two
compounds, then fixed and stained with Hoechst prior to determination of EYFP
fluorescence/cell on an Ascent plate reader. Plate and solution backgrounds have been
subtracted; the dashed lines on each graph indicate the prestimulated fluorescence levels
for each cell line in these experiments.



Figure 16 

Alignment of fluorescent proteins.
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Tables 

Table 2 Oligonucleotides used in cloning. Oligonucleotides beginning with P* are
phosphorylated at the 5' end to permit ligation.



Oligo   SEQ ID NO:   Oligonucleotide sequence (5' end to 3' end)
        15           _TTR9- GGTACTGCTTTGAGATTCGTCGG
        16           a— rp-Q n _ aTrATTGGAGTTTTAGAAGCTC
        17           PUG A P A ATPTGTGTGGGCACTCGACCGG
2110    18           P*CATGGGCGGTGGTACCGGTTCCGGTGCCCTGAAGAAGGAGCTGCAGG
2111    19           P*AGrTGGTTCTTCAGGGCACCGGAACCGGTACCACCGGC
2112    20           P*CCAACAAGAAGGAGCTGGCCCAGCTGAAGTGGGAGCTGCAG
2113    21           P*CTCCCACTTCAGCTGGGCCAGCTCCTTCTTGTTGGCCTGC
2114    22           P*GGCCTGAAGAAGGAGCTGGCCCAGTAG
2115    23           P*GATCCTACTGGGCCAGCTCCTTCTTCAGGGCCTGCAG
2116    24           P*CATGGCCAGCGAGCAGCTGGAGAAGAAGCTGCAGGCCCTG
2117    25           P* n GTGG AGCTTCTTCTCCAGCTGCTCGCTGGC
2118    26           P* G a a A AGAA GGTGGCCCAGCTGGAGTGGAAGAACCAGGCCCTGGAG
        27           P* G- P r TGGTTCTTCCACTCCAGCTGGGCCAGCTTCTTCTCCAGGG
        28           P* a a a A A GG TGGGCGAG GGCGGCACCGGTTAG
2121    29           ■n *r« 7\ -nr»r"— *7i a r 1 r , f2r ,r rrzGGGGGGTGGGCCAGCTTCTTCTCCAG
2128    30
2129    31           GCCGGAGCCjr\jlAL-vwAt-__ X I (a liiL 1 ^UHVJU x xvjx^j
2130    32           GCCGGACCGGTACCACCCTGCTTGTCGGCCATG
2131    33           GGGGGACCGGTACCACCCTCGATGTTGTGGCGGATC
2132    34           CCCCGGATCCTACTTGTACAGCTCGTCCATGC
2133    35           GGCGCCATGGGCACCGGTTACAACAGCCACAACGTC
2134    36           GGCGCCATGGGCACCGGTAAGAACGGCATCAAGGTG
2135    37           GGCGCCATGGGCACCGGTGACGGCAGCGTGCAGCTC
2219    38           GGGGGCTAGCGCCACCATGGTGAGCAAGGGCGAG
2222    39           GCGGGGGATCCGATATCGCCAGAGCCAGAGCCAGAGCCCTCGATGTTGTGGCGGATC
2225    40           GGGGGCTAGCGATATCCGGCTCTGGCTCTGGCTCTGGCGACGGCAGCGTGCAGCTC
2333    41           GCCCACCCTCGTGACCACCTTCGGCTACGGCCTGCAGTGCTTCGCCCGCTACCCCGACCACATG
2334    42           CATGTGGTCGGGGTAGCGGGCGAAGCACTGCAGGCCGTAGCCGAAGGTGGTCACGAGGGTGGGC
2335    43           GCCCACCCTCGTGACCACCCTGGGCTACGGCCTGC^GXGCx^CGCCCGCTACCCCGA^ ACATG
2336    44           CATGTGGTCGGGGXAGCGGGCGAAGCACTGCAGGCCGTAGCCCAG TGGGC
2337    45
2338    46           GCTCAGGGCGGACTGGTAGCTCAGGTAGTGGTTGTC
2442    47           ATTB1-CCACCATGGGAGTGCAGGTGGAAACC
2443    48           ATTB2-CTTCCAGTTTTAGAAGCTC
2444    49           ATTB1-CCACCATGGAGATGTGGCATGAAGGCCTG
2445    50           ATTB2-CCTGCTTTGAGATTCGTCGGAACAC
9655    51           TCCTAGGTCAGTCCTGCTCCTCGGCCACGAAGTGCAC
                     TCCTAGGCTGCAGCACGTGTTGACAATTAATCATCGG
9658    52           CAGACAATCTGTGTGGGCACTCGACCGG
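Several oligonucleotides in Table 2 appear in pairs that anneal to one another (for example 2333/2334), so a reverse-complement helper is a convenient way to check a pair against each other. The sketch below is illustrative; the helper name is hypothetical and only oligo 2334 is taken from the table.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Oligo 2334 as listed in Table 2 (SEQ ID NO: 42).
oligo_2334 = "CATGTGGTCGGGGTAGCGGGCGAAGCACTGCAGGCCGTAGCCGAAGGTGGTCACGAGGGTGGGC"
print(reverse_complement(oligo_2334))
```

Comparing the printed strand with the listed partner oligo is a quick way to spot transcription errors in either entry.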



Table 3 Primer pairs used in EGFP fragment amplification

Protein encoded by PCR fragment   5' primer   3' primer
EGFP(1-144)                       2128        2129
                                  2128
EGFP(1-172)                       2128        2131
EGFP(145-238)                     2133        2132
EGFP(158-238)                     2134        2132
EGFP(173-238)                     2135        2132
Table 4 Cloning and expression vectors

Vector      Expressed protein          Promoter   Selection (E. coli/mamm.)
pEGFP-C1    EGFP                                  kan/neo
PS0609      EGFP                       CMV        zeo/zeo
pTrcHis-A   no insert                  Trc        amp/none
PS1515      NZ leucine zipper          Trc        amp/none
PS1516      CZ leucine zipper          Trc        amp/none
PS1614      NtermEGFP144-NZ            Trc        amp/none
PS1596      NtermEGFP157-NZ            Trc        amp/none
PS1597      NtermEGFP172-NZ            Trc        amp/none
PS1615      CZ-CtermEGFP145            Trc        amp/none
PS1594      CZ-CtermEGFP158            Trc        amp/none
PS1595      CZ-CtermEGFP173            Trc        amp/none
PS1559      NtermEGFP157-NZ            CMV        kan/neo
PS1560      NtermEGFP172-NZ            CMV        kan/neo
PS1557      CZ-CtermEGFP158            CMV        zeo/zeo
PS1558      CZ-CtermEGFP173                       zeo/zeo
PS1639      NtermEYFP157-NZ            CMV        kan/neo
PS1642      NtermEYFP172-NZ            CMV        kan/neo
PS1640      NtermE[F64L]YFP157-NZ      CMV        kan/neo
PS1641      NtermE[F64L]YFP172-NZ      CMV        kan/neo
PS1637      CZ-CtermEYFP158            CMV        zeo/zeo
PS1638      CZ-CtermEYFP173            CMV        zeo/zeo
PS1769      NtermE[F64L]YFP172-FKBP    CMV        kan/neo
PS1767      NtermE[F64L]YFP172-FRB     CMV        kan/neo
PS1771      FRB-CtermEYFP173           CMV        zeo/zeo
PS1768      FKBP-CtermEYFP173          CMV        zeo/zeo



Table 5 Sequence names and numbers

SEQ ID NO:   Name
1            Amino acid sequence of GFP
2            Amino acid sequence of GFP Y66W
3            Amino acid sequence of GFP Y66H
4            Amino acid sequence of EGFP
5            Amino acid sequence of EYFP
6            Amino acid sequence of EYFP F64L variant
7            Nucleic acid sequence of NZ
8            Amino acid sequence of NZ
9            Nucleic acid sequence of CZ
10           Amino acid sequence of CZ
11           NtermE[F64L]YFP172 and FKBP linker sequence
12           NtermE[F64L]YFP172 and FRB linker sequence
13           FRB and CtermEYFP173 linker sequence
14           FKBP and CtermEYFP173 linker sequence
15-52        Primer sequence (see Table 2)



All cited patents, publications, copending applications, and provisional applications
referred to in this application are herein incorporated by reference.

The invention being thus described, it will be obvious that the same may be varied in
many ways. Such variations are not to be regarded as a departure from the spirit and
scope of the present invention, and all such modifications as would be obvious to one
skilled in the art are intended to be included within the scope of the following claims.
Claims

1. Two GFP fragments comprising

(a) an N-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number 1 to amino acid number X of GFP, wherein the peptide bond between
amino acid number X and amino acid number X+1 is within a loop of GFP and

(b) a C-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number X+1 to amino acid number 238 of GFP.

2. Two GFP fragments comprising

(a) an N-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number 1 to amino acid number X of GFP, wherein the peptide bond between
amino acid number X and amino acid number X+1 is within a loop of GFP and

(b) a C-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number Y+1 to amino acid number 238 of GFP, wherein Y<X creating an
overlap of the two GFP fragments, and wherein the peptide bond between amino acid Y
and amino acid Y+1 is within a loop of GFP.
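The fragment definitions in claims 1 and 2 translate directly into string slicing on the 238-residue GFP sequence of SEQ ID NO: 1. The sketch below is illustrative only; the function name is hypothetical, and the one-letter sequence is a transcription of SEQ ID NO: 1 written in 16-residue chunks.

```python
# Aequorea victoria GFP (SEQ ID NO: 1) in one-letter code, 16 residues per chunk.
GFP = (
    "MSKGEELFTGVVPILV" "ELDGDVNGHKFSVSGE" "GEGDATYGKLTLKFIC"
    "TTGKLPVPWPTLVTTF" "SYGVQCFSRYPDHMKQ" "HDFFKSAMPEGYVQER"
    "TIFFKDDGNYKTRAEV" "KFEGDTLVNRIELKGI" "DFKEDGNILGHKLEYN"
    "YNSHNVYIMADKQKNG" "IKVNFKIRHNIEDGSV" "QLADHYQQNTPIGDGP"
    "VLLPDNHYLSTQSALS" "KDPNEKRDHMVLLEFV" "TAAGITHGMDELYK"
)


def split_gfp(seq, x, y=None):
    """Return (N-terminal, C-terminal) fragments using 1-based residue numbers.

    N-terminal fragment: residues 1..x.
    C-terminal fragment: residues y+1..end; y defaults to x (no overlap).
    Choosing y < x gives overlapping fragments as in claim 2.
    """
    if y is None:
        y = x
    if not 0 < y <= x < len(seq):
        raise ValueError("require 0 < y <= x < sequence length")
    return seq[:x], seq[y:]


n_frag, c_frag = split_gfp(GFP, x=172)            # split between 172 and 173
n_olap, c_olap = split_gfp(GFP, x=172, y=157)     # overlapping fragments
print(len(n_frag), len(c_frag), len(c_olap))
```

With X = 172 and Y = 157 the two fragments share residues 158 to 172, which is the overlap referred to in claim 2.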

3. Two GFP fragments according to any of the preceding claims, wherein GFP is selected 
from the group consisting of EGFP, EYFP, ECFP, dsRed and Renilla GFP. 

4. Two GFP fragments according to any of the preceding claims, wherein the GFP is 
EGFP. 

5. Two GFP fragments according to any of the preceding claims, wherein the GFP is
EYFP.

6. Two GFP fragments according to any of the preceding claims, wherein the amino acid 
in position 1 preceding the chromophore has been mutated to provide an increase of 
fluorescence intensity. 

7. Two GFP fragments according to the preceding claim, wherein the amino acid F in
position 1 preceding the chromophore has been substituted by L.

8. Two GFP fragments according to any of the preceding claims, wherein the GFP has 
been mutated to further contain the S72A mutation. 
9. Two GFP fragments according to any of the preceding claims, wherein X is between 9
and 10 within the Thr9-Val11 loop; or between 23 and 24 within the Asn23-His25 loop; or
between 38 and 39 within the Thr38-Gly40 loop; or between 48 and 55 within the Cys48-Pro56
loop; or between 72 and 75 within the Ser72-Asp76 loop; or between 81 and 82
within the His81-Phe83 loop; or between 88 and 89 within the Met88-Glu90 loop; or between
101 and 102 within the Lys101-Asp103 loop; or between 114 and 117 within the Phe114-Thr118
loop; or between 128 and 144 within the Ile128-Tyr145 loop; or between 154 and
159 within the Ala154-Gly160 loop; or between 171 and 174 within the Ile171-Ser175
loop; or between 188 and 196 within the Ile188-Asp197 loop; or between 210 and 214
within the Asp210-Arg215 loop.
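The loop ranges listed in claim 9 lend themselves to a simple lookup that reports which loop, if any, a proposed split position X falls in. The sketch below reads each "between a and b" as the inclusive range a to b, which is an assumption; the names `LOOP_RANGES` and `loop_for_split` are hypothetical.

```python
# Ranges for X as listed in claim 9, keyed by the loop they fall in.
# Bounds are treated as inclusive (an assumption about the claim wording).
LOOP_RANGES = {
    "Thr9-Val11": (9, 10),
    "Asn23-His25": (23, 24),
    "Thr38-Gly40": (38, 39),
    "Cys48-Pro56": (48, 55),
    "Ser72-Asp76": (72, 75),
    "His81-Phe83": (81, 82),
    "Met88-Glu90": (88, 89),
    "Lys101-Asp103": (101, 102),
    "Phe114-Thr118": (114, 117),
    "Ile128-Tyr145": (128, 144),
    "Ala154-Gly160": (154, 159),
    "Ile171-Ser175": (171, 174),
    "Ile188-Asp197": (188, 196),
    "Asp210-Arg215": (210, 214),
}


def loop_for_split(x):
    """Return the name of the loop whose range contains x, or None."""
    for name, (low, high) in LOOP_RANGES.items():
        if low <= x <= high:
            return name
    return None


print(loop_for_split(157), loop_for_split(172), loop_for_split(100))
```

Positions 157 and 172, the two split points used in the later claims, fall in the Ala154-Gly160 and Ile171-Ser175 ranges respectively.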

10. Two GFP fragments according to the preceding claim, wherein X is between 154 and 
159 within the Ala154-Gly160 loop. 

11. Two GFP fragments according to the preceding claim, wherein X is 157 within the
Ala154-Gly160 loop.

12. Two GFP fragments according to the preceding claim, wherein X is between 171 and
174 within the Ile171-Ser175 loop.

13. Two GFP fragments according to any of the preceding claims, wherein X is 172 within
the Ile171-Ser175 loop.

14. Two GFP fragments according to the preceding claim, wherein Y is between 154 and
159 within the Ala154-Gly160 loop.

15. Two GFP fragments according to the preceding claim, wherein Y is 157 within the
Ala154-Gly160 loop.

16. Two GFP fragments according to any of the preceding claims, wherein X is 172 within
the Ile171-Ser175 loop and wherein Y is 157 within the Ala154-Gly160 loop.

17. Two GFP fragments according to any of the preceding claims, wherein the N-terminal
fragment of GFP is fused in frame with a first protein of interest.




18. Two GFP fragments according to any of the preceding claims, wherein the first protein
of interest is fused to the N-terminal of the N-terminal fragment of GFP.

19. Two GFP fragments according to any of the preceding claims, wherein the first protein
of interest is fused to the C-terminal of the N-terminal fragment of GFP.

20. Two GFP fragments according to any of the preceding claims, wherein the C-terminal
fragment of GFP is fused in frame with a second protein of interest.

21. Two GFP fragments according to any of the preceding claims, wherein the second
protein of interest is fused to the N-terminal of the C-terminal fragment of GFP.

22. Two GFP fragments according to any of the preceding claims, wherein the second
protein of interest is fused to the C-terminal of the C-terminal fragment of GFP.

23. Two GFP fragments according to any of the preceding claims, wherein the N-terminal
fragment of GFP fused in frame to a first protein of interest further comprises a linker
sequence between the N-terminal fragment of GFP and the first protein of interest.

24. Two GFP fragments according to any of the preceding claims, wherein the C-terminal
fragment of GFP fused in frame to a second protein of interest further comprises a linker
sequence between the C-terminal fragment of GFP and the second protein of interest.

25. Two GFP fragments according to any of the preceding claims, wherein the GFP is
EYFP further containing an F64L mutation, wherein X is 172, wherein the first protein of
interest fused to the N-terminal fragment of GFP is fused to the C-terminal of the
N-terminal fragment of GFP and wherein the second protein of interest fused to the
C-terminal fragment of GFP is fused to the N-terminal of the C-terminal fragment of GFP.

26. Two GFP fragments according to any of the preceding claims, wherein the GFP is
EYFP further containing an F64L mutation, wherein X is 157, wherein the first protein of
interest fused to the N-terminal fragment of GFP is fused to the C-terminal of the
N-terminal fragment of GFP and wherein the second protein of interest fused to the
C-terminal fragment of GFP is fused to the N-terminal of the C-terminal fragment of GFP.

28. The C-terminal fragment of GFP according to any of the preceding claims.

29. Nucleic acid encoding a fragment according to any of the preceding claims. 

30. A cell comprising an N-terminal fragment of GFP according to any of the preceding
claims.

31. A cell comprising a C-terminal fragment of GFP according to any of the preceding
claims.

32. A cell comprising the two GFP fragments according to any of the preceding claims. 

33. A vector comprising the two GFP fragments according to any of the preceding claims. 

34. A vector comprising the N-terminal fragment of GFP according to any of the preceding
claims.

35. A vector comprising the C-terminal fragment of GFP according to any of the preceding
claims.

36. A plasmid comprising the two GFP fragments according to any of the preceding
claims.

37. A plasmid comprising the N-terminal fragment of GFP according to any of the
preceding claims.

38. A plasmid comprising the C-terminal fragment of GFP according to any of the
preceding claims.

39. A method for detecting the interaction between two proteins of interest comprising the
steps of:

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP according to any of the preceding claims,
the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP according to any of the preceding claims; and

(b) measuring the fluorescence from the at least one cell,

fluorescent cells indicating interaction between the two proteins of interest.

40. A method for monitoring the interaction between two proteins of interest comprising
the steps of:

(a) providing at least one cell containing at least one stretch of nucleic acid encoding
two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP according to any of the preceding claims,
the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP according to any of the preceding claims;

(b) culturing the at least one cell under conditions allowing expression; and

(c) measuring the fluorescence from the at least one cell,

fluorescent cells indicating interaction between the two proteins of interest.

41. A method according to any of the preceding claims for detecting new interaction
partners, wherein one of the proteins of interest is known, and the other protein of interest
is an unknown protein, comprising the additional step of

- parallel transfection of the cells with both heterologous conjugates,

cells expressing interaction partners to the known protein of interest will be fluorescent and
thereby easily detectable.

42. A method according to any of the preceding claims for detecting new interaction
partners, wherein one of the proteins of interest is known, and the other protein of interest
is an unknown protein, comprising the additional steps of

- establishing a cell line that stably expresses the heterologous conjugate comprising the
known protein of interest;

- transfecting said cell line with a library of heterologous conjugates comprising the
potential interaction partners;

cells expressing interaction partners to the known protein of interest will be fluorescent and
thereby easily detectable.






43. A method for detecting compounds that induce interaction between two proteins of
interest comprising the steps of:

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,
the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above;

(b) measuring the fluorescence from the at least one cell of step (a);

(c) applying a test compound to the at least one cell of step (b); and

(d) measuring the fluorescence from the at least one cell of step (c);

an increase in fluorescence observed from step (b) to step (d) indicating that the test
compound added in step (c) is capable of inducing interaction between the two proteins of
interest.

44. A method for screening for compounds that interfere with a conditional interaction
between two protein components comprising the steps of:

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,
the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above;

(b) measuring the fluorescence from the at least one cell of step (a);

(c) applying a test compound and a compound that induces interaction between two proteins
of interest to the at least one cell of step (b); and

(d) measuring the fluorescence from the at least one cell of step (c);

an increase in fluorescence observed from step (b) to step (d) indicating that the test
compound added in step (c) does not prevent interaction between the two proteins of
interest; whereas a lesser increase in fluorescence observed from step (b) to step (d)
indicates that the test compound will interfere with the induced interaction between the
two proteins of interest.

45. A method according to any of the preceding claims wherein the at least one cell is a
heterogeneous cell population, comprising the additional steps of

- removal of the most green cells;
- removal of the black cells;

hereby obtaining "medium to low-green" cells.
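The read-out logic of the interference screen in claim 44, comparing the fluorescence increase from step (b) to step (d) with what the inducing compound would give on its own, can be sketched as below. The reference increase and the 0.5 cut-off are hypothetical parameters introduced for illustration; the claims do not specify either.

```python
def classify_test_compound(before, after, reference_increase, threshold=0.5):
    """Label a test compound from step (b) and step (d) fluorescence readings.

    before, after: fluorescence measured in step (b) and step (d).
    reference_increase: increase seen with the inducing compound alone
                        (hypothetical control, not specified in the claims).
    threshold: hypothetical cut-off, as a fraction of the reference increase.
    """
    if reference_increase <= 0:
        raise ValueError("reference increase must be positive")
    increase = after - before
    if increase >= threshold * reference_increase:
        return "no interference"
    return "interferes"


# Illustrative readings in arbitrary units (not data from the application).
print(classify_test_compound(before=100.0, after=1000.0, reference_increase=1000.0))
print(classify_test_compound(before=100.0, after=200.0, reference_increase=1000.0))
```

In practice the cut-off would be set from replicate controls rather than fixed in advance.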
46. A method according to the preceding claim, wherein the removal steps are carried out
by FACS.

47. A method according to any of the preceding claims wherein the at least one cell is a
heterogeneous cell population with a high dynamic range, comprising the additional steps
of:

- stimulating the "medium to low-green" cells with a compound that induces interaction
between two proteins of interest;
- allowing sufficient time to pass to let the proteins interact and the fluorescent protein
fragments fold and become fluorescent; and
- isolating the most green cells;

this population of cells will have a very low background and still be capable of forming the
fluorescent protein upon interaction between the two proteins of interest.

48. A method according to the preceding claim, wherein the isolation step is carried out by 
FACS. 
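The two-stage selection of claims 45 to 48 (discard the brightest and the black cells, stimulate, then keep the brightest responders) can be mimicked on per-cell intensity values as in the sketch below. The cut-offs, the kept fraction and both function names are hypothetical; an actual sort would use gates set on the instrument.

```python
def gate_medium_to_low_green(intensities, black_cutoff, bright_cutoff):
    """Return indices of cells above the 'black' and below the 'most green' cut-off."""
    return [i for i, v in enumerate(intensities) if black_cutoff < v < bright_cutoff]


def isolate_most_green(intensities, kept, fraction=0.2):
    """After stimulation, return the brightest fraction of the kept cells."""
    ranked = sorted(kept, key=lambda i: intensities[i], reverse=True)
    if not ranked:
        return []
    n = max(1, round(len(ranked) * fraction))
    return ranked[:n]


# Illustrative per-cell intensities before and after stimulation (arbitrary units).
pre = [1.0, 50.0, 60.0, 500.0, 70.0]
post = [1.0, 900.0, 80.0, 600.0, 950.0]
kept = gate_medium_to_low_green(pre, black_cutoff=10.0, bright_cutoff=200.0)
print(kept, isolate_most_green(post, kept))
```

Cells selected this way are dim before stimulation and bright after it, which is the high dynamic range the claims aim for.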






49. A method according to any of the preceding claims, wherein the at least one cell is a 
mammalian cell. 

ABSTRACT 

Fluorescence complementation products with intensity levels mimicking the full-length
intensities are obtained by introduction of improved folding capabilities with a mutation in
position 1 preceding the chromophore. This is particularly seen with the yellow variant of
Green Fluorescent Protein (GFP). An additive increase is obtained by splitting the GFP
between amino acids 172 and 173. Screening for drugs capable of preventing interaction
between proteins is performed by selecting the cells with the highest dynamic range
through Fluorescence Activated Cell Sorting (FACS), as illustrated with the ability of
FK506 to break the rapamycin-induced interaction between FRB and FKBP.
[Figure: BRIGHTNESS OF SPLIT EGFP VARIANTS. Panel labels: NtermEGFP157-NZ + CZ-CtermEGFP158;
NtermEYFP157-NZ + CZ-CtermEYFP158; NtermE[F64L]YFP157-NZ + CZ-CtermEYFP158; EYFP;
NtermEGFP172-NZ + CZ-CtermEGFP173; NtermEYFP172-NZ + CZ-CtermEYFP173;
NtermE[F64L]YFP172-NZ + CZ-CtermEYFP173]
[Fig. 8. Panel labels: NtermEYFP157-NZ + CZ-CtermEGFP158; NtermEYFP172-NZ + CZ-CtermEYFP158;
NtermEYFP172-NZ + CZ-CtermEGFP173; NtermE[F64L]YFP157-NZ + CZ-CtermEGFP158;
NtermE[F64L]YFP172-NZ + CZ-CtermEYFP158; NtermE[F64L]YFP172-NZ + CZ-CtermEGFP173]
Fig. 16

P42212 avGFP MSKGEELFTGWPILVELDGDV 22 

AY015996 rmGFP MSKQILKNTCLQEVMSYKVNLEGIV 25 

AF372525 rrGFP MD LAKLGLKEVMPTKINLEGLV ' 22 

AF168419 dsRed MRS SKNVI KEFMRFKVRMEGTV 22 

AF322222 asCP5 62 MASFLKKTMPFKTTIEGTV 19 

AF168422 asCP562 MAQSKHGLTKEMTMKYRMEGCV 22 

AF246709 asFP595 MASFLKKTMPFKTTIEGTV 19 

AF322221 asFP499 MYPSIKETMRVQLSMEGSV 19 

AF3 84683 mcGFP . MS VI KPIMEI KLRMQGW 18 

AF401282 mfGFP MS V I KPDMKI KLRMEGAV 18 

AF168424 csFP484 MKCKFVFCLS FLVLAITNANT FLRNEADLEEKTLRI PKALTTMGVI KPDMKI KLKMEGNV 60 

AF168420 dsFP483 MS CSKSVI KEEMLIDLHLEGTF 22 

AY015995 spGFP MNRNVLKNTGLKEIMSAKASVEGIV 25 

AF168423 zsFP53 8 MAHS KKGL KEEMTMKYHME GC V 22 . 

AF168421 amFP486 MALSNKFIGDDMKMTYHMDGCV 22 

AY013 824 amGFPxm MSKGEELFTGIVPVLIELDGDV 22 

P4 2 2 1 2 avGFP NGHKF S VS GEGE GDAT YGKL - - TLKF I CTTG- KLPVPWPTLVTTFS YGVQCFSRYPDHMK 7 9 

AY015996 rmGFP NNHVFTMEGCGKGMILFGNQ- -LVQIRVTKGAPLPFAFDIVSPAFQYGNRTFTKYPNDIS 83 

AF3 72525 rrGFP GDHAFSMEGVGEGNI LEGTQ - - EVKI S VTKGAPLP FAFD I VS VAFS YGNRAYTGYPEE I S 80 

AF168419 dsRED NGHEFEIEGEGEGRPYEGHN- -TVKLKVTKGGPLPFAWDILSPQFQYGSKVYVKHPADIP 80 

AF322222 asCP562 NGHYFKCTGKGEGNPFEGTQ - - EMKIEVIEGGPLPFAFHILSTSCMYGSKTFI KYVSGI P 77 

AF168422 asCP562 DGHKFVITGEGIGYPFKGKQ- -AINLCVVEGGPLPFAEDILSAAFNYGNRVFTEYPQDIV 80 

AF246709 asFP595 NGHYFKCTGKGEGNPFEGTQ- -EMKIEVIEGGPLPFAFHILSTSCMYGSKTFI KYVSGIP 77 

AF322221 asFP499 NYHAFKCTGKGEGKPYEGTQ- -SLNITI TEGGPLPFAFDILSHAFQYGIKVFAKYPKEIP 77 

AF384683 mcGFP NGHKWIKGEGEGKPFEGTQ--TINLTVKEGAPLPFAYDIM 76 

AF401282 mfGFP NGHKFVIEGDGKGKPFEGTQ - - SMDLTVKEGAPLPFAYD I LTTVFDYGNRVFAKYPQDI P 76 

AF168424 CSFP484 NGHAFVIEGEGEGKPYDGTH- - TIiNLEVKEGAPLPFSYDIIiSNAFQYGNRADTKYPDDIA 118 

AF168420 dsFP483 NGHYFEIKGKGKGQPNEGTN- - TVTLEVTKGGPLPFGWHILCPQFQYGNKAFVHHPDNIH 80 

A¥0l5995 spGFP iSa^v FSMEGFGKGjsTvLFGNQ - - LMQ IRVTKGGPLPFAFD IV S I AFQ YGNRTFTKx PDD I A 83 

AF168423 ZSFP538 NGHKFVITGEGIGYPFKGKQ--TINLCVIEGGPLPFSEDILSAGFKYGDRIFTEYPQDIV 80 

AF168421 amFP486 NGHYFTVKGEGNGKPYEGTQTSTFKVTMANC^PLAFS 82 

AY013 824 amGFPXM HGHKFSVRGEGEGDAD YGKL - - EI KF I CTTG - KLPVP WPTLVTTFS YGI QCFARYPEHMK 79" 

P42212 avGFP QHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEV- -KFEG DTLVNRIELKGIDFKEDG 134 

AY015996 rmGFP --DYFIQSFPAGFMYERTLRYEDGGLVEIRSDI--NLIE DKFVYRVEYKGSNFPDDG 136 

AF372525 rrGFP - -DYFLQSFPEGFTYERNIRYQDGGTAIVKSDI - - SLED GKFIVNVDFKAKDLRRMG 133 

AF168419 dsRED - - DYKKLS FPEGFKWERVMNFEDGGWTVTQDS - - SLQD GCFIYKVKFIGVNFPSDG 133 

AF322222 asCP562 - -DYFKQSFPEGFTWERTTTYEDGGFLTAHQDT- -SLDG- - 7DCLVYKVKILGNNFPADG 130 

AF168422 asCP562 --DYFKNSCPAGYTWDRSFLFEDGAVCICNADITVSVEEN CMYHESKFYGVNFPADG 135 

AF246709 asFP595 - -DYFKQSFPEGFTWERTTTYEDGGFLTAHQDT- - SLDG DCLVYKVKILGNNFPADG 130 

AF322221 asFP499 - -DFFKQSLPGGFSWERVSTYEDGGVLSATQET- -SLQG DCI I CKVKVLGTNFPANG 130 

AF384683 mcGFP - -DYFKQTFPEGYSWERIMAYEDQSICTATSDI - - KMEG DCFIYEIQFHGVNFPPNG 129 

AF401282 mfGFP - -DYFKQTFPEGYSWERSMTYEDQGI CVATNDI - -TIiMKGVDDCFVYKIRFDGVNFPAMG 132 

AF168424 CSFP484 --DYFKQSFPEGYSWERTMTFEDKGIVKVKSDI — SMEE DSFIYEIRFDGMNFPPNG 171 

AF168420 dsFP483 - -DYLKLSFPEGYTWERSMHFEDGGLCCITNDI- -SLTG NCFYYDIKFTGLNFPPNG 133 

AY015995 spGFP - -DYFVQSFPAGFFYERNLRFEDGAIVDIRSDI - - SLED DKFHYKVEYRGNGFPSNG 136 

AF168423 ZSFP538 - - D YFKNS CPAGYTWGRSFLFEDGAVCI CNVD I TVSVKEN CIYHKS I FNGMNFPADG 135 

AF168421 amFP486 - -DYFKQAFPDGMS YERTFTYEDGGVATASWEI - - SLKGN CFEHKSTFHGVNFPADG 135 

AY013824 amGFPXM MNDFFKSAMPEGYIQERTIFFQDDGKYKTRGEV--KFEG DTLVNRIELKGMDFKEDG 134 

P42212 . avGFP NI LGHKLE YNYNSHNVYIMADKQKNGI KVNFKI RHNI EDGS VQLADHYQQN- - TP I GDGP 192 

AY015996 rmGFP PVM- Q KT I LGIE P S FEAMYMN - - NGVLVGE VI LVYKLNS GKYYS CHMKTL MKSKGW 190 

AF372525 rrGFP PVM-QQDIVGMQPSYESMYTN- - VTSVIGECI IAFKLQTGKHFTYHMRTV- - - YKSKKPV 187 

AF168419 dsRED PVM-QKKTMGWEASTERLYPR- -DGVLKGEIHKALKLKDGGHYLVEFKSI YMAKKP- 186 

AF322222 asCP562 P ; RDAEQS- 137 

AF168422 asCP562 PVM-KKMTDNWEPSCEKIIPVPKQGILKGDVSMYLLLKDGGRLRCQFDTV YKAKSVP 191 

AF246709 asFP595 PVM- QNKAGRWE PATE I VYEV - -DGVLRGQSLMALKCPGGRHLTCHLHTTYRSKKPASA- 186 

AF322221 asFP499 PVM-QKKTCGWEPSTETVTPR- -DGGLLLRDTPALMLADGGHLSCFMETT YKSKKE- 183 

AF384683 mcGFP PVM- QKKTLKWEPSTEKMYVR- -DGVLKGDVNMALLLEGGGHYRCDFRST YKAKKR- 182 

AF401282 mfGFP PVM-QKKTLKWEPSTEKMYVR- -DGVLKGDVNMALLLEGGGHYRCDFKTT YKAKKF- 185 

AF168424 CSFP484 PVM-QKXTLK^EPSTEIMYVR---DGVLVGDISHSLLLEGG<3HYRCDFKSI YKAECKV- 224 

AF168420 dsFP483 PW-QKKTTGWEPSTERLYPR- -XX3VLIGDIHHALTVEGGGHYACDIKTV YRAKKAA 187 

AY015995 spGFP PVM- QKAILGMEPSFEVVYMN- - SGVLVGEVDLVYKLESGNYYSCHMKTF YRSKGGV 190 

AF168423 ZSFP53 8 PVM- KKMTTNWEAS CEKIMPVPKQGI LKGDVSMYT J iLKPGGRYRCQFDTV YKAKSVP 191 

AF168421 amFP486 PVM-AKKTTGWDPSFBKMTVC — DGILKGDVTAFLMLQGGGNYRCQFHTS YKTK-KP 188 

AY013824 amGFPXM NILGHKLEYNFNSHNVYmPDKZ^GLKV^ -VPLGDGP 192 




P42212 avGFP VLLPDNHYIiSTQSALSKDPNEKRDHMVT 1T1EFVT&AGITHGMDELYK 238 
AY015996 rmGFP KEFPSYHFIQHRLEKTYV-EDGG- FVEQHETAIAQMTS IGKPLGSLHEWV 238

AF372525 rrGFP ETMPLYHFIQHRLVKTNV-DTASGYVVQHETAIAAHSTIKKIEGSLP 233 

AF168419 dsRED VQLPGYYYVDSKIdDITSHTNEDYTIVEQYERTEGRHHLFL 225 

AF322222 asCP562 RKMG ASHRDTL 148 

AF168422 asCP562 RKMPDWHFIQHKLTREDRSDAKNQKWHLTEHAIASGSALP 231 

AF246709 asFP595 LKMPGFHFEDHRI E IMEE - VEKGKC YKQYEAAVGRYCDAAP S KLGHN 23 2 

AF3 22221 asFP499 VKLPELHFHHLRMEKLNI - SDDWKTVEQHESWAS YS - QVPSKLGHN '228 

AF3 84683 mcGFP VQLPDYHFVDHRIEILSH-DNDYNTVKLSEDAEARYSMLPSQAK 225 

AF401282 mfGFP VQLPDYHFVDHRIEILSH-DKDYMKVKLYEHAEA-HSGLPRQAK 227 

AF168424 CSFP484 VKL PD YHFVDHRI E I LMH - DKDYNKVTL YEN AVAR YS LLP S QA 266 

AF168420 dsFP483 LKMPGYHYVDTKLVI WNN" - DKE FMKVE EHE I AVARHHP F YE P KEZDK 232 

AY015995 spGFP KEFPEYHFIHHRLEKTYV- EEGS - FVEQHETAIAQLTT IGKPLGSLHEWV 238 

AF168423 zsFP53 8 SKMPEWHFIQHKLLREDRSDAKNQKWQLTEHAIAFPSALA 231 

AF168421 amFP486 VTMPPNHWEHRIARTDLDKGGNS-VQLTEHAVAHITSWPF 229 

AY013 824 atnGFPXM VLIPINHYLSTQTAISKDRNETRDHMVFLEFFSACGHTHGMDELYK 23 8 
SEQUENCE LISTING

<110> BioImage A/S

<120> FLUOROPHORE COMPLEMENTATION PRODUCTS
<130> 1016PC1
<160> 52

<170> PatentIn version 3.1



<210> 1 

<211> 238 

<212> PRT 

<213> Aequorea victoria 

<400> 1 

Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15

Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95

Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175

Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190

Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 2
<211> 238
<212> PRT

<213> Aequorea victoria
<400> 2

Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15

Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

Ser Trp Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95

Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175

Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190

Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235

<210> 3
<211> 238
<212> PRT

<213> Aequorea victoria
<400> 3

Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15

Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

Ser His Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95

Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175

Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190

Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 4

<211> 239

<212> PRT

<213> Aequorea victoria

<400> 4

Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15

Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30

Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45

Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60

Leu Thr Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys
65 70 75 80

Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95

Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110

Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125

Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140

Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160

Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175

Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190

Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu
195 200 205

Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220

Val Thr Ala Ala Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 5
<211> 239
<212> PRT

<213> Aequorea victoria
<400> 5

Val Pro He Leu 
15 

Ser Val Ser Gly 



Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val 
1 5 10 

Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe 
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20 25 30 

Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45

Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60

Phe Gly Tyr Gly Leu Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys
65 70 75 80

Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95

Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110

Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125

Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140

Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160

Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175

Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190

Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu
195 200 205

Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220

Val Thr Ala Ala Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 6 

<211> 239 

<212> PRT 

<213> Aequorea victoria 

<400> 6 

Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15

Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30

Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45

Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60

Leu Gly Tyr Gly Leu Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys
65 70 75 80

Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95

Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110

Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125

Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140

Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160

Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175

Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190

Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser [illegible] Gln Ser Ala Leu

195 200 205 

Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe 
210 215 220 

Val Thr Ala Ala Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys
225 230 235 



<210> 7 

<211> 121 

<212> DNA 

<213> Artificial 

<220> 

<221> CDS 

<222> (3)..(116)

<223> Constructed sequence 

<400> 7 

cc atg gcc ggt ggt acc ggt tcc ggt gcc ctg aag aag gag ctg cag
Met Ala Gly Gly Thr Gly Ser Gly Ala Leu Lys Lys Glu Leu Gln
1 5 10 15

gcc aac aag aag gag ctg gcc cag ctg aag tgg gag ctg cag gcc ctg
Ala Asn Lys Lys Glu Leu Ala Gln Leu Lys Trp Glu Leu Gln Ala Leu
20 25 30

aag aag gag ctg gcc cag tag gatcc 121
Lys Lys Glu Leu Ala Gln
35



<210> 8 

<211> 37 

<212> PRT 

<213> Artificial 

<220> 

<223> Constructed sequence 
<400> 8 

Met Ala Gly Gly Thr Gly Ser Gly Ala Leu Lys Lys Glu Leu Gln Ala
1 5 10 15

Asn Lys Lys Glu Leu Ala Gln Leu Lys Trp Glu Leu Gln Ala Leu Lys
20 25 30

Lys Glu Leu Ala Gln
35 
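
As a cross-check on SEQ ID NO 7 and SEQ ID NO 8, the following minimal Python sketch translates the coding region given by the <222> feature (positions 3 to 116) and compares the result with the 37-residue peptide. It assumes the standard genetic code and reads the two scan-ambiguous codons in the first row as tcc and gcc; the names translate_cds, SEQ_ID_7 and SEQ_ID_8 are illustrative and not part of the application.

```python
# Standard genetic code, with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate the 1-based inclusive range start..end, stopping at a stop codon."""
    cds = dna[start - 1:end]
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# SEQ ID NO 7 as listed (assumption: tcc and gcc in the first row).
SEQ_ID_7 = (
    "cc atg gcc ggt ggt acc ggt tcc ggt gcc ctg aag aag gag ctg cag "
    "gcc aac aag aag gag ctg gcc cag ctg aag tgg gag ctg cag gcc ctg "
    "aag aag gag ctg gcc cag tag gatcc"
).replace(" ", "")

# SEQ ID NO 8 in one-letter code.
SEQ_ID_8 = "MAGGTGSGALKKELQANKKELAQLKWELQALKKELAQ"

translated = translate_cds(SEQ_ID_7, 3, 116)
print(len(SEQ_ID_7), len(translated), translated == SEQ_ID_8)
```

Under these assumptions the nucleotide length agrees with <211> 121 and the translation agrees with the peptide listed as SEQ ID NO 8.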



<210> 9 

<211> 121 

<212> DNA 

<213> Artificial 

<220> 

<221> CDS 

<222> (3)..(116)

<223> Constructed sequence 

<400> 9 

cc atg gcc agc gag cag ctg gag aag aag ctg cag gcc ctg gag aag 47
Met Ala Ser Glu Gln Leu Glu Lys Lys Leu Gln Ala Leu Glu Lys
1 5 10 15

aag ctg gcc cag ctg gag tgg aag aac cag gcc ctg gag aag aag ctg 95
Lys Leu Ala Gln Leu Glu Trp Lys Asn Gln Ala Leu Glu Lys Lys Leu
20 25 30

gcc cag ggc ggc acc ggt tag gatcc 121
Ala Gln Gly Gly Thr Gly
35 



<210> 10

<211> 37

<212> PRT

<213> Artificial

<220>

<223> Constructed

<400> 10

Met Ala Ser Glu Gln Leu Glu Lys Lys Leu Gln Ala Leu Glu Lys Lys
1 5 10 15

Leu Ala Gln Leu Glu Trp Lys Asn Gln Ala Leu Glu Lys Lys Leu Ala
20 25 30

Gln Gly Gly Thr Gly
35



<210> 11

<211> 19

<212> PRT

<213> Artificial

<220>

<223> Constructed

<400> 11

Gly Ser Gly Ser Gly Ser Gly Asp Ile Thr Ser Leu Tyr Lys Lys Ala
1 5 10 15

Gly Ser Thr



<210> 12

<211> 19

<212> PRT

<213> Artificial

<220>

<223> Constructed

<400> 12

Gly Ser Gly Ser Gly Ser Gly Asp Ile Thr Ser Leu Tyr Lys Lys Ala
1 5 10 15

Gly Ser Thr



<210> 13

<211> 18

<212> PRT

<213> Artificial

<220>

<223> Constructed

<400> 13

Asp Pro Ala Phe Leu Tyr Lys Val Val Ile Ser Gly Ser Gly Ser Gly
1 5 10 15

Ser Gly



<210> 14

<211> 18 

<212> PRT 

<213> Artificial 

<220> 

<223> Constructed sequence 
<400> 14 

Asp Pro Ala Phe Leu Tyr Lys Val Val Ile Ser Gly Ser Gly Ser Gly
1 5 10 15



Ser Gly 



<210> 15 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence

<400> 15

cctactgctt tgagattcgt cgg 23



<210> 16

<211> 22

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 16

gtcattccag ttttagaagc tc 22



<210> 17

<211> 28

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 17

cagacaatct gtgtgggcac tcgaccgg 28



<210> 18

<211> 47 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 18 

catggccggt ggtaccggtt ccggtgccct gaagaaggag ctgcagg 47 



<210> 19 

<211> 38 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 19 

agctccttct tcagggcacc ggaaccggta ccaccggc



<210> 20 

<211> 41 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 20 

ccaacaagaa ggagctggcc cagctgaagt gggagctgca g 41 



<210> 21

<211> 40

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 21

ctcccacttc agctgggcca gctccttctt gttggcctgc 40



<210> 22 
<211> 27 






<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 22 

gccctgaaga aggagctggc ccagtag 

<210> 23 

<211> 37 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 23 

gatcctactg ggccagctcc ttcttcaggg cctgcag 



<210> 24 

<211> 40 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 24 

catggccagc gagcagctgg agaagaagct gcaggccctg 



<210> 25 

<211> 31 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 25 

cctgcagctt cttctccagc tgctcgctgg c 



<210> 26 

<211> 45 

<212> DNA 

<213> Artificial 




<220> 

<221> misc_feature
<223> Primer sequence 

<400> 26 

gagaagaagc tggcccagct ggagtggaag aaccaggccc tggag 45 



<210> 27

<211> 44

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 27

ggcctggttc ttccactcca gctgggccag cttcttctcc aggg 44



<210> 28

<211> 30

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 28

aagaagctgg cccagggcgg caccggttag







<210> 29 

<211> 40 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature 

<223> Primer sequence 

<400> 29 

gatcctaacc ggtgccgccc tgggccagct tcttctccag 



<210> 30 

<211> 24 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 






<400> 30 

ggcgccatgg tgagcaaggg cgag 



<210> 31 

<211> 35 

<212> DNA 

<213> Artificial 



<220> 

<221> misc_feature
<223> Primer sequence 



<400> 31 

gccggaccgg taccaccgtt gtactccagc ttgtg 



<210> 32 
<211> 33 
<212> DNA 



<213> Artificial 



<220> 

<221> misc_feature
<223> Primer sequence 

<400> 32 

gccggaccgg taccaccctg cttgtcggcc atg 



<210> 33 

<211> 36 

<212> DNA 

<213> Artificial 



<220> 

<221> misc_feature

<223> Primer sequence 

<400> 33 



gccggaccgg taccaccctc gatgttgtgg cggatc 



<210> 34 

<211> 32 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 34 

ccccggatcc tacttgtaca gctcgtccat gc 







<210> 35 

<211> 36 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 35 

ggcgccatgg gcaccggtta caacagccac aacgtc 



<210> 36 

<211> 36 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 36 

ggcgccatgg gcaccggtaa gaacggcatc aaggtg 



<210> 37

<211> 36

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 37

ggcgccatgg gcaccggtga cggcagcgtg cagctc



<210> 38 

<211> 34 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature 

<223> Primer sequence 

<400> 38 



gggggctagc gccaccatgg tgagcaaggg cgag 




<210> 39 

<211> 57 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature
<223> Primer sequence

<400> 39

gcgggggatc cgatatcgcc agagccagag ccagagccct cgatgttgtg gcggatc 57 

<210> 40 

<211> 56 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature
<223> Primer sequence 

<400> 40 

gggggctagc gatatccggc tctggctctg gctctggcga cggcagcgtg cagctc 56 

<210> 41 

<211> 64 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature
<223> Primer sequence

<400> 41

gcccaccctc gtgaccacct tcggctacgg cctgcagtgc ttcgcccgct accccgacca 60
catg 64




<210> 42

<211> 64

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 42



catgtggtcg gggtagcggg cgaagcactg caggccgtag ccgaaggtgg tcacgagggt 60 
gggc 64 
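
SEQ ID NO 41 and SEQ ID NO 42 read as the two strands of a single mutagenic primer pair. The short Python sketch below checks that reading by computing the reverse complement of SEQ ID NO 41 and comparing it with SEQ ID NO 42. The function name reverse_complement and the variable names are illustrative only, and the sketch assumes the trailing "catg" belongs to SEQ ID NO 41 as positions 61 to 64.

```python
# Complement map for lowercase DNA, as used in the listing.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# SEQ ID NO 41 (assumption: the final "catg" completes the 64-mer).
SEQ_ID_41 = (
    "gcccaccctc gtgaccacct tcggctacgg cctgcagtgc ttcgcccgct accccgacca catg"
).replace(" ", "")

# SEQ ID NO 42 as listed.
SEQ_ID_42 = (
    "catgtggtcg gggtagcggg cgaagcactg caggccgtag ccgaaggtgg tcacgagggt gggc"
).replace(" ", "")

is_complementary_pair = reverse_complement(SEQ_ID_41) == SEQ_ID_42
print(len(SEQ_ID_41), len(SEQ_ID_42), is_complementary_pair)
```

Both strings come out at 64 nucleotides, in line with the <211> entries, and the comparison holds for the sequences as transcribed here.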



<210> 43

<211> 64

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 43


gcccaccctc gtgaccaccc tgggctacgg 


catg 




<210> 44

<211> 64

<212> DNA

<213> Artificial

<220>

<221> misc_feature

<223> Primer sequence

<400> 44

catgtggtcg gggtagcggg cgaagcactg caggccgtag cccagggtgg tcacgagggt 60
gggc 64

<210> 45 

<211> 36 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence

<400> 45

gacaaccact acctgagcta ccagtccgcc ctgagc 36



<210> 46 

<211> 36 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 



<400> 46 

gctcagggcg gactggtagc tcaggtagtg gttgtc 36







<210> 47 

<211> 26 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 47 

ccaccatggg agtgcaggtg gaaacc 



<210> 48 

<211> 19 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 48 

cttccagttt tagaagctc 



<210> 49 

<211> 29 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 

<400> 49 

ccaccatgga gatgtggcat gaaggcctg 



<210> 50 

<211> 25 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature 

<223> Primer sequence 

<400> 50 



cctgctttga gattcgtcgg aacac 



<210> 51

<211> 74

<212> DNA

<213> Artificial

<220>

<221> misc_feature
<223> Primer sequence 

<400> 51 

tcctaggtca gtcctgctcc tcggccacga agtgcactcc taggctgcag cacgtgttga 60 
caattaatca tcgg 74 



<210> 52 

<211> 28 

<212> DNA 

<213> Artificial 

<220> 

<221> misc_feature

<223> Primer sequence 



<400> 52

cagacaatct gtgtgggcac tcgaccgg 28